Using synthetic data to train vision-based ML systems removes one of the industry’s most painful and expensive bottlenecks – acquiring and labelling training data from the real world – but it presents the data scientist with a new and unfamiliar set of problems to solve.
If the training objective is simply to detect objects in standardized settings, such as products sitting on a shelf, it’s a simple matter to set up a 3D scene that will do that. If the objective is more complex, such as arranging for an actor to navigate a store and pick up various products, the difficulty begins to rise to the point where advanced animation and modeling skills are needed. Unless this issue is addressed, using synthetic data simply moves the bottleneck from one place to another.
The solution adopted by Chameleon is to abstract the actor control mechanisms to a level where intentions and objectives can be set in a way that is intuitive to a non-expert and let the simulator take care of the details.
To set up a scenario, the user defines sets of actors, gives them intentions and interests, then starts the simulation and allows the scenario to develop.
In a shopping scenario, this takes the form of choosing the demographic distribution of the actors and assigning it to a generator that automatically releases shoppers into the store, each with the intention of reaching the checkout. Each shopper generated in this way can be given a list of interests (i.e. a shopping list) that will distract them from that intention. The objects of interest, called Attractors, are distributed around the store, and the actors automatically seek them out.
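Chameleon's actual interface is not public, so the following Python sketch is purely illustrative: the names (`Shopper`, `Attractor`, `spawn_shoppers`) and parameters are hypothetical, intended only to show what "a generator releasing shoppers with a checkout intention and a shopping list" might look like as data.

```python
import random
from dataclasses import dataclass, field

@dataclass
class Attractor:
    """A hypothetical object of interest placed somewhere in the store."""
    name: str
    location: tuple  # (x, y) position on the store floor

@dataclass
class Shopper:
    demographic: str
    shopping_list: list = field(default_factory=list)
    intention: str = "checkout"  # every shopper ultimately heads here

def spawn_shoppers(n, demographics, products, seed=None):
    """Release n shoppers into the store. `demographics` is a
    {label: weight} distribution; each shopper gets a random
    shopping list of 1-3 products to seek out along the way."""
    rng = random.Random(seed)
    labels, weights = zip(*demographics.items())
    return [
        Shopper(
            demographic=rng.choices(labels, weights=weights)[0],
            shopping_list=rng.sample(products, k=rng.randint(1, 3)),
        )
        for _ in range(n)
    ]
```

A call such as `spawn_shoppers(50, {"adult": 6, "senior": 2, "teen": 2}, ["milk", "bread", "cereal", "coffee"])` would then stand in for the user's high-level scenario description.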
By balancing the level of interest in a particular product against its attractiveness (is it on sale?), the user can modify the shopper’s priorities and produce variations in how they navigate the store. Once the shopping list is filled, the shopper proceeds to the checkout.
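One simple way to model that balancing act, again purely as a hypothetical sketch rather than Chameleon's real mechanism, is to score each listed item by interest × attractiveness and visit items in descending score order, with the checkout always last:

```python
def visit_order(shopping_list, interest, attractiveness):
    """Rank the shopping list by interest x attractiveness: a
    moderately interesting item on sale can outrank a more
    interesting full-price one. Checkout is the final stop."""
    score = {item: interest.get(item, 1.0) * attractiveness.get(item, 1.0)
             for item in shopping_list}
    route = sorted(shopping_list, key=lambda item: score[item], reverse=True)
    return route + ["checkout"]
```

With `interest = {"milk": 0.5, "coffee": 0.9}` and milk on sale (`attractiveness = {"milk": 2.0}`), milk scores 1.0 against coffee’s 0.9, so the shopper detours to the milk first.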
Chameleon provides tools for assigning these parameters according to various statistical distributions, so that a wide variety of shoppers of different demographic profiles will seek out a variety of products with different degrees of urgency, in different orders, and along different routes. Thus a huge diversity of activities can be generated from a simple, intuitive description of the scenario, with no need for expertise in 3D programming or animation.
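The distributions themselves are unspecified in the post, so the choices below (a categorical age group, a Gaussian walking speed, a beta-distributed urgency) are assumptions for illustration only; the point is that each shopper's parameters are sampled rather than hand-authored:

```python
import random

def sample_shopper_parameters(n, seed=None):
    """Draw per-shopper parameters from simple statistical
    distributions (illustrative choices, not Chameleon's actual ones):
    a categorical age group, a normally distributed walking speed
    clamped to a sane minimum, and a beta-distributed urgency
    skewed towards unhurried browsing."""
    rng = random.Random(seed)
    params = []
    for _ in range(n):
        params.append({
            "age_group": rng.choices(
                ["child", "adult", "senior"], weights=[2, 6, 2])[0],
            "walking_speed": max(0.5, rng.gauss(1.3, 0.2)),  # metres/second
            "urgency": rng.betavariate(2, 5),  # 0 = browsing, 1 = in a hurry
        })
    return params
```

Sampling a few hundred such parameter sets, then feeding each one to the shopper generator, is what turns one scenario description into a diverse dataset.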
In Chameleon, actors are able to query the system about what actions to perform when they reach a particular attractor. This makes it easy to swap out attractors without disturbing the base scenario to produce greater diversity of activities.
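The query mechanism can be sketched as follows; the class and method names are invented, but the structure shows why swapping attractors leaves the base scenario untouched: the action lives on the attractor, and the actor merely asks for it on arrival.

```python
class Attractor:
    """An attractor owns the action an actor should perform on
    arrival, so attractors can be swapped without touching actors."""
    def __init__(self, name, action):
        self.name = name
        self.action = action  # e.g. an animation tag

    def query_action(self):
        return self.action

class Actor:
    def on_reach(self, attractor):
        # Ask the attractor what to do rather than hard-coding
        # behaviour per target in the actor or the scenario
        return attractor.query_action()

shelf = Attractor("cereal shelf", "reach_and_grab")
freezer = Attractor("ice cream freezer", "open_door_then_grab")
```

Replacing `shelf` with `freezer` in the scene changes what every visiting actor does at that spot, with no edits to the actors or the scenario logic.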
Of course, it is always possible in Chameleon to manually place specific actors in precise locations and assign them fixed paths to specific destinations. This is useful for ‘hero’ actions that must be precisely controlled, but in most cases the ability to quickly and simply define a complex scene will produce abundant training data covering all needs. By combining these emergent scenarios with the manual placement of actors, it becomes possible to produce specific corner cases within realistic crowd activity.
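Continuing the hypothetical sketch, a scene combining the two modes might simply hold both a generated crowd and a list of scripted ‘hero’ actors whose waypoints are followed exactly:

```python
from dataclasses import dataclass

@dataclass
class HeroActor:
    """A manually placed actor that follows its waypoints exactly,
    unlike generated shoppers, whose routes emerge from the simulation."""
    name: str
    waypoints: list  # fixed (x, y) path, followed in order

def build_scene(crowd, heroes):
    """Combine emergent crowd activity with scripted corner cases."""
    return {"crowd": crowd, "heroes": heroes}

# A scripted corner-case event staged at a precise spot within
# otherwise-normal crowd activity (all values illustrative)
fall = HeroActor("slip_and_fall", [(2, 0), (2, 8), (3, 8)])
```

The crowd provides realistic background activity; the hero provides the rare event the detector actually needs examples of.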
The overall objective of the Chameleon toolset is to give data scientists the power to produce training data at will; the ability to produce complex scenarios from simple instructions is a key part of that.
Editor’s Note: This is the second in a series of posts by Mindtech’s VP Engineering, Peter McGuinness.