Intelligent beings learn the skills they need through interaction with the world. Now AI researchers plan to teach virtual agents new skills using a similar strategy.
In 2009, Fei-Fei Li, a computer scientist at Princeton University, prepared a dataset that had a significant impact on the world of artificial intelligence.
This dataset, called ImageNet, contains millions of labeled images that can train sophisticated machine learning models to recognize particular objects in an image.
By 2015, machines trained on it had outperformed humans at image recognition. Soon after, Li began pursuing a project she calls Polestar, which she believes will push artificial intelligence in a different direction, toward genuine intelligence.
Li found inspiration in the Cambrian explosion, nearly 530 million years ago, when a huge variety of animal species first appeared. One thought-provoking theory holds that the burst of new species was driven partly by the emergence of eyes that could see the surrounding world for the first time. Li realized that animal vision evolved in the service of movement, orientation, survival, and adaptation to the environment. “This natural observation led me to look for a more active vision for AI,” she says.
Li’s work on artificial intelligence agents is no longer limited to recognizing still images from a dataset; she now studies intelligent agents that can move around and interact with their environment in simulated 3D virtual worlds.
This broad goal is the basis for an emerging field known as “embodied AI.”
Embodied AI overlaps with robotics, since a robot is the physical embodiment of an AI agent in the real world, and with reinforcement learning, in which agents learn to do better over time by pursuing long-term rewards.
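The idea of learning from long-term rewards can be illustrated with tabular Q-learning, one of the simplest reinforcement-learning algorithms. The toy corridor environment below is invented purely for illustration; real embodied agents use far richer observations and deep networks:

```python
import random

random.seed(0)  # deterministic toy run

# Toy corridor: states 0..4, with a reward only at the far end.
# Q-learning propagates that delayed reward back to earlier states,
# so the agent learns to value actions by their long-term payoff.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]  # step left, step right

def env_step(state, action):
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1):
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if random.random() < epsilon:                      # explore
                action = random.choice(ACTIONS)
            else:                                              # exploit
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            next_state, reward, done = env_step(state, action)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * best_next
                                           - q[(state, action)])
            state = next_state
    return q

q = train()
# The greedy policy in every non-goal state is "step right" (+1),
# even though only the final step is directly rewarded.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1]
```

The delayed reward reaches earlier states only through the discounted `best_next` term, which is exactly what “learning from long-term rewards” means here.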
Li and other researchers believe embodied AI could mark a fundamental shift, from machines learning straightforward abilities such as image recognition to learning how to perform complex, multi-step tasks. To track these learning machines’ progress, Li’s group has produced a standard set of virtual activities for evaluating them.
Research in embodied AI includes training agents that can examine their surroundings and, when necessary, change them. Whereas in robotics the intelligent agent always takes a physical form, such as a robotic arm, an agent in a realistic simulation may have a virtual body, or may perceive the world from the viewpoint of a moving camera that can interact with its surroundings.
“The word embodiment here does not refer to a physical body, but to interacting with and acting on the environment,” says Li.
As a series of new virtual worlds comes online, embodied AI agents are making significant progress in understanding new environments. Interaction gives agents a new, and in many cases better, way to learn about the world around them, and that richer understanding helps the agents become smarter.
Still, the field is young. “Currently, we have no conclusive evidence of how much an AI agent can learn just by interacting with the world,” says Viviane Clay, an artificial intelligence researcher at the University of Osnabrück in Germany.
Moving toward full simulation
Researchers have long wanted to create realistic virtual worlds for artificial intelligence agents, but serious work in the area only became feasible in roughly the past five years, as GPU performance improved dramatically. Graphics techniques honed by the film and video game industries now make it possible to render virtual environments realistic enough for agents to interact with.
By 2017, artificial intelligence agents could inhabit virtual worlds that depict interior spaces realistically. AI2-THOR, a simulator developed by computer scientists at the Allen Institute for Artificial Intelligence, lets agents tour kitchens, bathrooms, living rooms, and bedrooms much as they would in the real world. Agents can study 3D views that change as they move, and can gather new information by taking a closer look at the environment and its objects.
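Simulators in this vein typically expose a step-based loop: the agent issues an action and receives a fresh observation of the scene. The sketch below imitates that loop with a hypothetical 2-D grid “room” standing in for a full 3-D simulator; the `GridRoom` class and its action names are invented for illustration and are not AI2-THOR’s actual API:

```python
# Minimal sketch of a step-based embodied-agent loop, in the spirit of
# room simulators. The GridRoom class is hypothetical: a 2-D grid
# stands in for a simulator's 3-D scenes.
class GridRoom:
    def __init__(self, width=4, height=4, objects=None):
        self.width, self.height = width, height
        self.agent = (0, 0)                     # agent's grid position
        self.objects = dict(objects or {})      # position -> object name

    def step(self, action):
        x, y = self.agent
        moves = {"MoveNorth": (0, 1), "MoveSouth": (0, -1),
                 "MoveEast": (1, 0), "MoveWest": (-1, 0)}
        dx, dy = moves[action]
        nx, ny = x + dx, y + dy
        if 0 <= nx < self.width and 0 <= ny < self.height:
            self.agent = (nx, ny)               # walls block movement
        # The "observation": what the agent perceives where it stands.
        return {"position": self.agent,
                "visible": self.objects.get(self.agent)}

room = GridRoom(objects={(2, 0): "mug"})
obs = None
for action in ["MoveEast", "MoveEast"]:
    obs = room.step(action)
print(obs)  # {'position': (2, 0), 'visible': 'mug'}
```

The key property, shared with the real simulators, is that each observation depends on everything the agent has done so far, not on a fixed dataset index.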
These new worlds allow agents to reason about changes in a new dimension, time. Manolis Savva, a computer graphics researcher at Simon Fraser University who has created several virtual worlds, says:
“We are building a new and completely different kind of resource, one meant to serve embodied artificial intelligence. You have access to a continuous, coherent stream of information that you can control.”
These simulated worlds now work well enough to train agents on entirely new skills. Instead of merely recognizing an object, agents can interact with it, pick it up, and navigate around it. These may seem like small steps, but every intelligent agent needs them to begin understanding its surroundings.
In 2020, virtual agents gained a sense beyond vision: the ability to hear the sounds that virtual objects make, which opened a new chapter in how agents learn about and act on the world.
Of course, this does not mean that the work is over. Daniel Yamins, a computer scientist at Stanford University, says: “The work done so far on simulated environments and embodied artificial intelligence is negligible compared to the real world.”
Yamins and his colleagues at MIT and IBM developed ThreeDWorld, a virtual environment that aims for real-world fidelity, modeling details such as how liquids behave when spilled on different surfaces.
“This is backbreaking work and a great research challenge: helping to train artificial intelligence agents through new learning methods,” says Savva.
Comparing neural networks
A simple way to measure the progress of embodied artificial intelligence is to compare the performance of embodied agents against that of algorithms trained to perform simple tasks with static images.
Researchers note that such comparisons are not perfect, but preliminary results suggest that embodied AI agents learn differently, and sometimes better, than their predecessors.
In one recent paper, researchers found that an embodied artificial intelligence agent was more accurate at detecting certain objects, improving on existing agents by almost 12%. “It took more than three years for the field to reach this level of progress,” says Roozbeh Mottaghi, one of the paper’s authors and a computer scientist at the Allen Institute.
Even traditionally trained algorithms improve at object recognition when they are placed in a virtual environment and allowed to explore it, or to move around and gather multiple views of an object.
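One simple way gathering multiple views can help is prediction pooling: classification scores from several viewpoints of the same object are averaged, so one informative view can outweigh several ambiguous ones. A minimal sketch with made-up class scores (the logits below are invented, not output from any real model):

```python
import numpy as np

# Hedged sketch: combining class scores from several viewpoints of the
# same object. Averaging softmax outputs over views often yields a more
# confident, correct prediction than any single ambiguous view.
def softmax(logits):
    z = np.exp(logits - logits.max())   # subtract max for stability
    return z / z.sum()

# Made-up logits for classes [mug, bowl, vase] from three camera views.
views = np.array([
    [1.0, 1.2, 0.1],   # from this angle, a bowl looks slightly likelier
    [2.0, 0.5, 0.2],   # the handle is visible: clearly a mug
    [1.8, 0.9, 0.1],
])
per_view = np.array([softmax(v) for v in views])
avg = per_view.mean(axis=0)     # pool evidence across views
print(avg.argmax())  # 0 -> "mug" wins after pooling the views
```

Note that the first view alone would have been misclassified; the pooled prediction is not.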
Researchers have found that embodied and traditional algorithms learn differently.
The evidence comes from neural networks, the core learning component of both embodied and non-embodied algorithms. A neural network is a layered algorithm made up of artificial neurons, loosely inspired by the networks of neurons in the human brain.
The researchers found that the neural networks of embodied agents have fewer neurons that activate in response to visual information, meaning each neuron responds more selectively to stimuli. Non-embodied networks were far less selective, requiring many more neurons to remain active most of the time.
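Selectivity of this kind can be quantified by the fraction of a layer’s units that activate for a given input. A small sketch with invented activation values (illustrative numbers, not data from the study):

```python
import numpy as np

def active_fraction(activations, threshold=0.0):
    """Fraction of units whose activation exceeds the threshold."""
    activations = np.asarray(activations)
    return float((activations > threshold).mean())

# Toy ReLU-style activations for the same layer in two hypothetical
# networks; the numbers are made up for illustration.
embodied_layer = np.array([0.0, 0.0, 2.1, 0.0, 0.0, 0.9, 0.0, 0.0])
static_layer   = np.array([0.4, 1.1, 0.7, 0.2, 0.9, 0.3, 1.5, 0.6])

print(active_fraction(embodied_layer))  # 0.25 -> sparse, selective
print(active_fraction(static_layer))    # 1.0  -> most units respond
```

A lower active fraction means each responding neuron carries more specific information about the stimulus.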
“That’s not to say that embodied versions are better; they just do things differently,” says Grace Lindsay, a professor at New York University.
Comparing embodied with non-embodied neural networks is one measure of progress, but researchers’ real interest is improving embodied agents’ performance on specific tasks, and the ultimate goal is learning more complex, human-like tasks. Navigation based on visual observation, for example, is a large and active research area. Here, an agent must keep its long-term destination in mind while planning a route that reaches it without getting lost or bumping into objects.
On this front, a team led by Dhruv Batra, director of Meta AI research and a computer scientist at the Georgia Institute of Technology, rapidly improved agents’ performance on a task called point-goal navigation.
“We provided the agent with a GPS and a compass and trained it to navigate to a target,” says Batra. The training took place in a virtual world called AI Habitat, built by Meta. Dropped into a completely new environment with no map, the agent had to reach a target specified only by coordinates relative to its starting point (for example, a point 5 meters north and 10 meters east).
Trained this way, the agent reached its destination with more than 99.9% accuracy on a standard dataset. The researchers then made the task harder, asking the agent to find its way without the GPS or the compass.
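The task setup can be sketched with a toy controller that, unlike the learned agents, simply follows the goal displacement greedily. This is not how the trained agents work; it only illustrates what “point-goal navigation with GPS and compass” asks for:

```python
def navigate(goal_north, goal_east):
    """Greedy point-goal controller on a unit grid (toy illustration).

    The agent knows only its displacement to the goal, as if read off
    a GPS and compass, and closes the north-south gap, then the
    east-west gap, one unit step at a time.
    """
    pos_n = pos_e = 0
    path = []
    while (pos_n, pos_e) != (goal_north, goal_east):
        if pos_n != goal_north:
            step = 1 if goal_north > pos_n else -1
            pos_n += step
            path.append("north" if step > 0 else "south")
        else:
            step = 1 if goal_east > pos_e else -1
            pos_e += step
            path.append("east" if step > 0 else "west")
    return path

# "Move to a point 5 meters north and 10 meters east."
path = navigate(5, 10)
print(len(path))  # 15 unit steps: 5 north, then 10 east
```

The hard part the learned agents solve, which this sketch ignores, is avoiding obstacles and, in the harder variant, estimating the displacement itself without GPS or compass.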
Dhruv Batra’s team hopes to refine the virtual environment and simulation until the agent can reach a predetermined goal after as little as 20 minutes of training.
“This is a wonderful development,” says Mottaghi. But it does not mean navigation is a solved task. In the real world, many tasks require navigation guided by complex instructions, such as “go through the kitchen to get the glasses on the bedside table in the bedroom,” which remains difficult for intelligent algorithms.
Navigation is among the simplest tasks in embodied artificial intelligence, because the agent moves through an environment without altering or manipulating it. As of this writing, embodied AI agents have little skill at working with new objects.
A big challenge in this field is that when an agent interacts with new objects, it makes frequent mistakes, and those errors accumulate as misleading experience. For now, researchers work around the problem by restricting agents to tasks involving only a few steps.
Still, most human activities, such as cooking or washing dishes, require long sequences of actions involving many different objects. Reaching that level of intelligence will take much more work.
Li has developed a dataset she hopes will do for embodied AI what her ImageNet project did for object recognition.
Just as she once gave the AI community a vast dataset of standardized images, her team has now released a standardized simulated dataset of 100 human-like activities on which agents can be tested in any virtual world.
By building benchmarks that compare agents’ performance with how humans perform the same tasks, Li’s new dataset makes it easier to assess the progress of virtual AI agents.
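For navigation-style tasks, one widely used benchmark score is SPL, Success weighted by Path Length, which rewards agents for reaching the goal by a near-shortest route rather than just reaching it at all. A minimal implementation, with illustrative episode numbers invented for the example:

```python
def spl(episodes):
    """Success weighted by Path Length, a standard embodied-navigation
    metric: an episode scores (shortest / taken) if the agent succeeds,
    and 0 if it fails; the benchmark score is the mean over episodes.

    Each episode is (success, shortest_path_length, agent_path_length),
    with both lengths in the same units.
    """
    total = 0.0
    for success, shortest, taken in episodes:
        if success:
            total += shortest / max(taken, shortest)
    return total / len(episodes)

# Illustrative numbers: one efficient success, one wasteful success,
# one failure.
episodes = [
    (True, 10.0, 10.0),   # perfect path   -> contributes 1.0
    (True, 10.0, 20.0),   # twice as long  -> contributes 0.5
    (False, 10.0, 5.0),   # failed episode -> contributes 0.0
]
print(spl(episodes))  # 0.5
```

Plain success rate would score this agent 2/3; SPL penalizes the wasteful route, which is what makes it useful for comparing agents against efficient human trajectories.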
“Once agents can perform complex tasks correctly, it’s time to train them in the real world,” says Li. “In my opinion, simulation is one of the most important and exciting areas of robotics research.”
The new frontier of robotics
Robots are inherently embodied agents: inhabiting a physical body in the real world, they are the most tangible form of AI agent. Yet many researchers have found that even robots benefit from training in virtual worlds.
“Advanced algorithms in robotics, such as reinforcement learning, typically require millions of iterations to learn something meaningful,” Mottaghi says. “As a result, training real robots on difficult tasks can take years.”
Training robots in virtual worlds, by contrast, makes learning far faster, because many agents can be trained simultaneously across thousands of different environments. Virtual training is also safer for robots that will interact closely with humans.
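The speed-up comes from vectorization: many copies of an environment advance in lockstep, so each training step yields a whole batch of experience instead of a single real-world trial. A toy sketch (the one-dimensional environment here is invented for illustration):

```python
import numpy as np

# Hedged sketch of why simulation accelerates robot learning: 1,000
# copies of a trivial environment advance together, so one call to
# step() produces 1,000 transitions of experience at once.
class VectorEnv:
    def __init__(self, n_envs, goal=5):
        self.positions = np.zeros(n_envs, dtype=int)
        self.goal = goal

    def step(self, actions):
        self.positions += actions                       # advance every copy
        rewards = (self.positions >= self.goal).astype(float)
        return self.positions.copy(), rewards

envs = VectorEnv(n_envs=1000)
rewards = None
for _ in range(5):                                      # 5 batched steps
    _, rewards = envs.step(np.ones(1000, dtype=int))
print(rewards.sum())  # all 1000 copies reach the goal together
```

A real robot would need 1,000 separate physical trials to gather the same amount of experience; the simulated batch takes a few microseconds.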
Simulators gained serious attention from robotics professionals when OpenAI researchers demonstrated that skills can transfer from simulation to the real world: they trained a robotic hand to manipulate a cube it had only ever encountered in simulation.
Building on these successes, drones have learned to avoid collisions while flying at low altitude, self-driving cars have driven through urban environments on two different continents, and four-legged robots have completed an hour-long hike in the Swiss Alps.
Some researchers believe that as virtual worlds mature, humans could enter them through virtual reality headsets and meet AI agents there, further narrowing the gap between simulation and the real world.
Dieter Fox, senior director of robotics research at NVIDIA and a professor at the University of Washington, notes that the primary goal of robotics research is to build robots that are useful to humans in the real world, but to do that, they must first encounter humans and learn how to interact with them.
“It would be interesting to use virtual reality to bring humans into simulated environments and enable conditions for them to interact with robots,” says Fox.
Embodied AI agents, whether in simulation or in the real world, are increasingly learning everyday tasks the way humans do, and the field is progressing on all fronts at once.
“I see the convergence of deep learning, robot learning, machine vision, and speech processing,” says Li. “In my opinion, this Polestar work can carry us to a higher level of artificial intelligence, and bring important achievements along the way.”