Actions

The agent within THOR can perform several actions that allow it to interact with the scenes within the environment. Actions are defined in unity/Assets/Scripts/PhysicsRemoteFPSAgentController.cs. The Object Types section has more information about object properties and interactions.

The commands below assume you have already run the following:

import ai2thor.controller
controller = ai2thor.controller.Controller()
controller.start()
# can be any one of the scenes FloorPlan###
controller.reset('FloorPlan28')
controller.step(dict(action='Initialize', gridSize=0.25))

Initialize

Initialize must be called after resetting a scene to set fields such as gridSize.

controller.step(dict(action='Initialize', gridSize=0.25))
Parameter Type Description Default
gridSize float Size of the grid that the agent navigates in. This determines the step size that the agent takes when the actions MoveAhead, MoveBack, MoveRight, MoveLeft are taken. 0.25
renderDepthImage bool When enabled a depth image is sent and made available on the returned Event as the attribute depth_frame. False
renderClassImage bool When enabled a class segmentation image is sent and made available on the returned Event as the attribute class_segmentation_frame. False
renderObjectImage bool When enabled an object segmentation image is sent and made available on the returned Event as the attribute instance_segmentation_frame. False
visibilityDistance float Distance in meters from the agent’s camera (positioned near the top of the agent) within which an object is considered visible. 1.0
cameraY float Height of the camera attached to the agent. 0.675
fieldOfView float Field of view for the agent’s camera. Corresponds to the value Camera.fieldOfView 60.0
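
As a small illustration, the parameter names in the table above can be kept in one place so that a misspelled name is caught before the action is sent. This is only a sketch: the helper name initialize_action is hypothetical and not part of ai2thor, and the default values are copied from the table for reference.

```python
# Parameter names and defaults copied from the table above.
INITIALIZE_DEFAULTS = {
    "gridSize": 0.25,
    "renderDepthImage": False,
    "renderClassImage": False,
    "renderObjectImage": False,
    "visibilityDistance": 1.0,
    "cameraY": 0.675,
    "fieldOfView": 60.0,
}


def initialize_action(**overrides):
    """Build an Initialize action dict (hypothetical helper, not an ai2thor API)."""
    unknown = sorted(set(overrides) - set(INITIALIZE_DEFAULTS))
    if unknown:
        raise ValueError("unknown Initialize parameters: " + ", ".join(unknown))
    # Only the overrides are sent; omitted parameters keep their defaults.
    return dict(action="Initialize", **overrides)


# Example: request depth frames and a finer navigation grid.
action = initialize_action(gridSize=0.1, renderDepthImage=True)
# With a running controller: event = controller.step(action)
```

Sending only the overridden parameters leaves every other setting at the default listed in the table.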



Object Position Randomization

After initializing the scene, pickupable objects can have their default positions randomized to any valid receptacle they could be placed in within the scene.

controller.reset('FloorPlan28')
controller.step(dict(action='Initialize', gridSize=0.25))
controller.step(dict(action='InitialRandomSpawn', randomSeed=0, forceVisible=False, maxNumRepeats=5))

Remember to reset and initialize the scene before using the Position Randomizer, otherwise seeded values will be inaccurate.

Parameter Type Description Default
randomSeed int Seeds the randomization so that scene states can be reproduced. Because the seed depends on the current state of the scene, remember to reset the scene with controller.reset() before running InitialRandomSpawn; otherwise the seeded randomization will not be accurate. 0
forceVisible bool When enabled, the scene will attempt to randomize all moveable objects outside of receptacles in plain view. Use this if you want to avoid objects spawning inside closed drawers, cabinets, etc. False
maxNumRepeats int How many times each object in the scene attempts to randomly spawn. Setting this value higher will lead to fewer spawn failures at the cost of performance. 5
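
The required ordering (reset, then Initialize, then InitialRandomSpawn) can be wrapped in one function so it is hard to get wrong. The sketch below is an assumption-laden illustration: randomize_objects and RecordingController are hypothetical names, and RecordingController is only a stand-in that records calls so the example can be run without launching the simulator.

```python
class RecordingController:
    """Hypothetical stand-in for ai2thor's Controller that only records calls."""

    def __init__(self):
        self.calls = []

    def reset(self, scene_name):
        self.calls.append(("reset", scene_name))

    def step(self, action):
        self.calls.append(("step", action))


def randomize_objects(controller, scene_name, seed, force_visible=False, max_repeats=5):
    """Reset, initialize, then randomize, in the order the text above requires."""
    controller.reset(scene_name)
    controller.step(dict(action="Initialize", gridSize=0.25))
    return controller.step(dict(
        action="InitialRandomSpawn",
        randomSeed=seed,
        forceVisible=force_visible,  # a Python bool: True or False
        maxNumRepeats=max_repeats,
    ))


# Dry run against the recorder; pass a real controller to act on a scene.
dry_run = RecordingController()
randomize_objects(dry_run, "FloorPlan28", seed=0)
```

Calling the function again with the same scene and seed should repeat the same sequence of calls, which is what the seeded randomization relies on.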

AddThirdPartyCamera

Add a third-party camera to the scene. See the Event/Metadata section for information on how to receive the frame.

controller.step(
    dict(
        action='AddThirdPartyCamera',
        rotation=dict(x=0, y=90, z=0),
        position=dict(x=-1.25, y=1.0, z=-1.0)
        )
    )



UpdateThirdPartyCamera

Update the position/rotation of a thirdPartyCamera. Both rotation and position are required.

controller.step(
    dict(
        action='UpdateThirdPartyCamera',
        thirdPartyCameraId=0, # id is available in the metadata response
        rotation=dict(x=0, y=90, z=0),
        position=dict(x=-1.25, y=1.0, z=-1.5)
        )
    )
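
Because every update must carry both rotation and position, a sequence of camera moves is easiest to build up front. The following sketch generates the actions for turning a third-party camera in place about the Y axis. The helper name camera_pan_actions and its parameters are hypothetical, and the camera id is assumed to be one obtained from the metadata response.

```python
def camera_pan_actions(camera_id, position, yaw_start=0.0, yaw_end=360.0, steps=8):
    """Return UpdateThirdPartyCamera actions that turn the camera about the Y axis.

    Hypothetical helper: the position stays fixed and only rotation.y changes.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    increment = (yaw_end - yaw_start) / steps
    actions = []
    for i in range(steps):
        actions.append(dict(
            action="UpdateThirdPartyCamera",
            thirdPartyCameraId=camera_id,
            rotation=dict(x=0, y=yaw_start + i * increment, z=0),
            position=dict(position),  # copy so each action owns its own dict
        ))
    return actions


pan = camera_pan_actions(0, dict(x=-1.25, y=1.0, z=-1.0), steps=4)
# With a running controller: for action in pan: controller.step(action)
```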



Agent Navigation

The agent can use these actions to navigate through the environment.


RotateRight

Rotate the agent 90 degrees to the right of its current facing.

event = controller.step(dict(action='RotateRight'))

RotateLeft

Rotate the agent 90 degrees to the left of its current facing.

event = controller.step(dict(action='RotateLeft'))

LookUp

Angle the agent’s view up in 30 degree increments (max upward angle is 30 degrees above the forward horizon)

event = controller.step(dict(action='LookUp'))

LookDown

Angle the agent’s view down in 30 degree increments (max downward angle is 60 degrees below the forward horizon)

event = controller.step(dict(action='LookDown'))

MoveAhead

Move the agent forward by gridSize.

event = controller.step(dict(action='MoveAhead'))
Parameter Type Description Default
moveMagnitude float Distance to move; overrides the gridSize value 0.0 (will default to gridSize)

MoveRight

Move the agent right by gridSize (without changing view direction).

event = controller.step(dict(action='MoveRight'))
Parameter Type Description Default
moveMagnitude float Distance to move; overrides the gridSize value 0.0 (will default to gridSize)

MoveLeft

Move the agent left by gridSize (without changing view direction).

event = controller.step(dict(action='MoveLeft'))
Parameter Type Description Default
moveMagnitude float Distance to move; overrides the gridSize value 0.0 (will default to gridSize)

MoveBack

Move the agent backward by gridSize (without changing view direction).

event = controller.step(dict(action='MoveBack'))
Parameter Type Description Default
moveMagnitude float Distance to move; overrides the gridSize value 0.0 (will default to gridSize)
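
To make the combined effect of the navigation actions concrete, the sketch below predicts where a sequence of actions would leave the agent, relative to its starting pose. It is a dead-reckoning illustration only, not how the simulator computes anything: it assumes every action succeeds (no collisions), assumes the default gridSize step, and assumes the Unity convention that a Y rotation of 0 faces +z and 90 faces +x. The function name predict_pose is hypothetical.

```python
# Forward direction as (dx, dz) grid steps for each heading, assuming a Y
# rotation of 0 faces +z and 90 faces +x.
FORWARD = {0: (0, 1), 90: (1, 0), 180: (0, -1), 270: (-1, 0)}


def predict_pose(actions, grid_size=0.25):
    """Predict (x, z, rotation, horizon) relative to the start, ignoring collisions."""
    gx = gz = 0    # position in whole grid steps, to avoid float drift
    rotation = 0   # degrees about Y
    horizon = 0    # degrees; negative looks up, positive looks down
    for name in actions:
        fx, fz = FORWARD[rotation]
        rx, rz = fz, -fx  # the agent's right is its forward turned 90 degrees clockwise
        if name == "RotateRight":
            rotation = (rotation + 90) % 360
        elif name == "RotateLeft":
            rotation = (rotation - 90) % 360
        elif name == "LookUp":
            horizon = max(horizon - 30, -30)  # at most 30 degrees above the horizon
        elif name == "LookDown":
            horizon = min(horizon + 30, 60)   # at most 60 degrees below the horizon
        elif name == "MoveAhead":
            gx, gz = gx + fx, gz + fz
        elif name == "MoveBack":
            gx, gz = gx - fx, gz - fz
        elif name == "MoveRight":
            gx, gz = gx + rx, gz + rz
        elif name == "MoveLeft":
            gx, gz = gx - rx, gz - rz
        else:
            raise ValueError("unsupported action: %s" % name)
    return (gx * grid_size, gz * grid_size, rotation, horizon)


pose = predict_pose(["MoveAhead", "MoveAhead", "RotateRight", "MoveAhead", "LookDown"])
```

Comparing such a prediction with the pose reported after the real actions is one way to notice that a move was blocked.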

Agent Teleportation

Use these actions to move the agent through the scene by warping to specified points rather than having to navigate to positions step by step.


Teleport

Move the agent to any location in the scene. Using this command it is possible to put the agent into places that would not normally be possible to navigate to, but it can be useful if you need to place an agent in the exact same spot for a task.

controller.step(dict(action='Teleport', x=0.999, y=1.01, z=-0.3541))
Parameter Type Description Default
x float x coordinate in 3D scene space 0.0
y float y coordinate in 3D scene space 0.0
z float z coordinate in 3D scene space 0.0

TeleportFull

Move the agent to any location in the scene. Using this command it is possible to put the agent into places that would not normally be possible to navigate to, but it can be useful if you need to place an agent in the exact same spot for a task. Identical to Teleport, but also allows rotation and horizon to be passed in.

event = controller.step(dict(action='TeleportFull', x=0.999, y=1.01, z=-0.3541, rotation=90.0, horizon=30.0))
Parameter Type Description Default
x float x coordinate in 3D scene space 0.0
y float y coordinate in 3D scene space 0.0
z float z coordinate in 3D scene space 0.0
rotation float Rotation about the Y axis to change the forward orientation of the Agent relative to world x/z axes 0.0
horizon float Rotation about the X axis to change the Up/Down look angle of the Agent. Any angle can be used here, but values of -30.0, 0.0, 30.0, 60.0 will mimic the maximum and minimum angles used by the LookUp and LookDown actions 0.0

The horizon angle values describe the rotation about the Agent’s X-Axis. This axis has “right hand” facing with respect to the forward Z-Axis, so the sign can be misleading: a horizon of -30.0 angles the agent’s forward Z direction 30 degrees upward. Because horizon values are rotations about the X-Axis, angles that differ by a full turn (for example -30.0 and 330.0) result in the same end position.

Horizon Angle Value Change in Forward Z
-30.0 (330.0) Look 30 Degrees Up
0.0 Look straight ahead
30.0 Look 30 Degrees Down
60.0 Look 60 Degrees Down
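
The equivalence between -30.0 and 330.0 in the table is ordinary angle wrapping, which the sketch below spells out. The function names are hypothetical helpers for illustration and are not part of ai2thor.

```python
def normalize_degrees(angle):
    """Map any angle to the equivalent value in [0, 360)."""
    return angle % 360.0


def same_horizon(a, b):
    """True when two horizon values describe the same look direction."""
    return normalize_degrees(a) == normalize_degrees(b)


def look_description(horizon):
    """Describe a horizon value following the table above (negative looks up)."""
    signed = (normalize_degrees(horizon) + 180.0) % 360.0 - 180.0  # into [-180, 180)
    if signed == 0:
        return "straight ahead"
    direction = "up" if signed < 0 else "down"
    return "%g degrees %s" % (abs(signed), direction)
```

For example, look_description(330.0) and look_description(-30.0) both report a 30 degree upward look.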

Get Reachable Positions

Returns valid coordinates that the Agent can reach without colliding with the environment or Sim Objects. This can be used in tandem with Teleport to warp the Agent as needed. This is useful for things like randomizing the initial position of the agent without clipping into the environment.

event = controller.step(dict(action='GetReachablePositions'))
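
A sketch of the randomized-start use case follows. It assumes the reachable positions have already been read from the returned event as a list of dicts with x, y and z keys; where exactly they appear in the event is covered by the Event/Metadata documentation and is not assumed here. The helper name random_teleport_action and the sample coordinates are made up for illustration.

```python
import random


def random_teleport_action(positions, seed=None):
    """Pick one reachable position and build a Teleport action for it.

    `positions` is assumed to be a list of dicts with 'x', 'y' and 'z' keys.
    """
    if not positions:
        raise ValueError("no reachable positions supplied")
    rng = random.Random(seed)  # a private generator keeps the choice reproducible
    target = rng.choice(positions)
    return dict(action="Teleport", x=target["x"], y=target["y"], z=target["z"])


# Made-up coordinates for illustration only.
sample_positions = [
    dict(x=0.0, y=1.0, z=0.0),
    dict(x=0.25, y=1.0, z=0.0),
    dict(x=0.25, y=1.0, z=0.25),
]
action = random_teleport_action(sample_positions, seed=7)
# With a running controller: event = controller.step(action)
```

Passing a fixed seed makes the chosen start position repeatable across runs.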

Object Interaction

These actions allow the agent to interact with Sim Objects in the scene in various ways.


Pickup Object

Pick up an Interactable object specified by objectId and move it to the Agent’s Hand. Note that the agent’s hand must be clear of obstruction: if the target object being in the Agent’s Hand would cause it to clip into the environment, this will fail.

Picked up objects can also obstruct the Agent’s view of the environment since the Agent’s hand is always in camera view, so know that picking up larger objects will obstruct the field of vision.

Moveable Receptacles: Note that certain objects are Receptacles that can themselves be picked up. If a moveable receptacle is picked up while other Sim Objects are inside of it, the contained objects will be picked up with the moveable receptacle. This allows for sequences like “Place Egg on Plate -> Pick Up Plate” to move both the Plate and Egg.

event = controller.step(dict(action='PickupObject', objectId="Mug|0.25|-0.27"))
Parameter Type Description Default
objectId string the string unique id of the target object null

Put Object

Attempt to Put an object the Agent is holding onto/in the target Receptacle.

event = controller.step(dict(action='PutObject', objectId = "Tomato|0.1|3.2|0.43", receptacleObjectId = "TableTop|0.25|-0.27|0.95"))
Parameter Type Description Default
objectId string The string unique id of the object in the Agent’s hand that it is trying to put down. Note that an error will be thrown if trying to put an object the Agent is not holding. null
receptacleObjectId string The string unique id of the target receptacle to attempt putting the object in/on. null
forceAction bool Enable to ignore any Receptacle Restrictions when attempting to place objects. Normally objects will fail to be put on a receptacle if that receptacle is not valid for the object. See the Receptacle Object Types page for more details. False
placeStationary bool If placeStationary = False is passed in, a placed object will use the physics engine to resolve the final position. This means placing an object on an uneven surface may cause inconsistent results due to the object rolling around or even falling off of the target receptacle. Note that because of variances in physics resolution, this placement mode is non-deterministic! If placeStationary = True, the object will be placed in/on the valid receptacle without using physics to resolve the final position. This means that the object will be placed so that it will not roll around. For deterministic placement, make sure to set this to True! True
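
Moving an object from one place to another is a PickupObject followed by a PutObject that names the same object. The sketch below builds that pair of actions; pickup_and_put_actions is a hypothetical helper, and the object ids are the example ids used in this section, not ids guaranteed to exist in any scene.

```python
def pickup_and_put_actions(object_id, receptacle_id, force=False, stationary=True):
    """Return the PickupObject and PutObject actions for moving one object."""
    return [
        dict(action="PickupObject", objectId=object_id),
        dict(
            action="PutObject",
            objectId=object_id,                # must be the object currently held
            receptacleObjectId=receptacle_id,
            forceAction=force,                 # True ignores receptacle restrictions
            placeStationary=stationary,        # True avoids physics-based settling
        ),
    ]


steps = pickup_and_put_actions("Tomato|0.1|3.2|0.43", "TableTop|0.25|-0.27|0.95")
# With a running controller: for action in steps: controller.step(action)
```

Keeping placeStationary at True matches the documented default and keeps the placement repeatable.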

Drop Object

Drop a held object and let Physics resolve where it lands. Note that this is different from the Put Object function, as this does not guarantee the held object will be put into a specified receptacle. This is meant to be used in tandem with the Move/Rotate Hand functions to maneuver a held object to a target area, and then let it drop.

Additionally, this Drop action will fail if the held object is not clear from all collisions. Most importantly, the Agent’s collision will prevent Drop, as dropping an object if it is “inside” the agent will lead to unintended behavior.

event = controller.step(dict(action='DropHandObject'))

Throw Object

An extension of the Drop function: throw a held object in the current forward direction of the Agent with a force specified by moveMagnitude. Because objects can have different Mass properties, certain objects will require more or less force to travel the same distance.

event = controller.step(dict(action='ThrowObject', moveMagnitude=150.0))
Parameter Type Description Default
moveMagnitude float The amount of force used to throw the object. Note that objects of different masses will have different throw distances if this magnitude is not changed 0.0

OpenObject

Open an object specified by objectId.

The target object must be within range of the Agent and Interactable in order for this action to succeed. An object can fail to open if it hits another object as it is opening. In this case the action will fail and the target object will reset to the position it was last in.

event = controller.step(dict(action='OpenObject', objectId="Fridge|0.25|0.75"))

Here is an example of opening the Fridge halfway:

event = controller.step(dict(action='OpenObject', objectId="Fridge|0.25|0.75", moveMagnitude = 0.5))
Parameter Type Description Default
objectId string the string unique id of the target object null
moveMagnitude float Pass a magnitude between 0.0 and 1.0 to open by the corresponding percentage. For example, a magnitude of 0.5 will cause the object to open halfway, and a value of 0.25 will open the object to a quarter of its fully open position. 0.0
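
Since moveMagnitude is only meaningful between 0.0 and 1.0 here, a small wrapper can reject out-of-range values before they reach the simulator. This is a sketch with a hypothetical helper name (open_object_action); when no fraction is given it omits moveMagnitude entirely, as in the first OpenObject example above.

```python
def open_object_action(object_id, fraction=None):
    """Build an OpenObject action; `fraction` in [0.0, 1.0] opens partially."""
    action = dict(action="OpenObject", objectId=object_id)
    if fraction is not None:
        if not 0.0 <= fraction <= 1.0:
            raise ValueError("fraction must be between 0.0 and 1.0")
        action["moveMagnitude"] = float(fraction)
    return action


half_open = open_object_action("Fridge|0.25|0.75", fraction=0.5)
# With a running controller: event = controller.step(half_open)
```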

CloseObject

Close an object specified by objectId.

The target object must be within range of the Agent and Interactable in order for this action to succeed. An object can fail to close if it hits another object as it is closing. In this case the action will fail and the target object will reset to the position it was last in.

event = controller.step(dict(action='CloseObject', objectId="Fridge|0.25|0.75"))
Parameter Type Description Default
objectId string the string unique id of the target object null

Toggle On

Toggles an object specified by objectId into the On state if applicable. Notable examples are Lamps, Light Switches, and Laptops.

event = controller.step(dict(action='ToggleObjectOn', objectId="LightSwitch|0.25|-0.27|0.95"))
Parameter Type Description Default
objectId string the string unique id of the target object null

Toggle Off

Toggles an object specified by objectId into the Off state if applicable. Notable examples are Lamps, Light Switches, and Laptops.

event = controller.step(dict(action='ToggleObjectOff', objectId="LightSwitch|0.25|-0.27|0.95"))
Parameter Type Description Default
objectId string the string unique id of the target object null

Object Manipulation

After the agent has picked up a Sim Object that is pickupable, these actions can be used to manipulate held items in various ways.


Move Hand Forward

Moves the Agent’s hand and held object forward relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandAhead', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0

Move Hand Back

Moves the Agent’s hand and held object backward relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandBack', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0

Move Hand Left

Moves the Agent’s hand and held object left relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandLeft', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0

Move Hand Right

Moves the Agent’s hand and held object right relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandRight', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0

Move Hand Up

Moves the Agent’s hand and held object up relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandUp', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0

Move Hand Down

Moves the Agent’s hand and held object down relative to the agent’s current facing. The hand can only be moved if it is holding an object.

event = controller.step(dict(action='MoveHandDown', moveMagnitude = 0.1))
Parameter Type Description Default
moveMagnitude float The distance, in meters, to move the hand in this direction 0.0
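
The six Move Hand actions are three axes with two directions each, so a signed offset per axis is enough to choose among them. The sketch below shows that mapping; hand_move_actions is a hypothetical helper, and the resulting actions still only succeed while the hand is holding an object.

```python
def hand_move_actions(ahead=0.0, right=0.0, up=0.0):
    """Translate signed offsets in meters into MoveHand* actions.

    Negative values select the opposite direction (back, left, down).
    """
    axes = [
        (ahead, "MoveHandAhead", "MoveHandBack"),
        (right, "MoveHandRight", "MoveHandLeft"),
        (up, "MoveHandUp", "MoveHandDown"),
    ]
    actions = []
    for offset, positive_name, negative_name in axes:
        if offset == 0:
            continue  # nothing to do along this axis
        name = positive_name if offset > 0 else negative_name
        actions.append(dict(action=name, moveMagnitude=abs(offset)))
    return actions


moves = hand_move_actions(ahead=0.1, right=-0.05)
# With a running controller: for action in moves: controller.step(action)
```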

Rotate Hand

Rotates the hand and held object about the specified axes (x, y, z) to the specified degree. These examples rotate a held object to 90 degrees about each axis.

event = controller.step(dict(action='RotateHand', x = 90))
event = controller.step(dict(action='RotateHand', y = 90))
event = controller.step(dict(action='RotateHand', z = 90))

Multiple Axes can be specified at once as well.

event = controller.step(dict(action='RotateHand', x = 90, y = -15, z = 28))
Parameter Type Description Default
x float rotation about the object’s x axis 0.0
y float rotation about the object’s y axis 0.0
z float rotation about the object’s z axis 0.0


Failures

Depending on where the agent is located within a scene, an action can fail. For the Move actions, the likely cause is a collision with an object in the agent’s path. PutObject and PickupObject failures are due to visibility: the target object to be picked up, or the receptacle object in the case of PutObject, is not visible to the agent.
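
A failed action can be detected from the metadata of the returned event. The sketch below assumes the metadata dict exposes the fields lastActionSuccess and errorMessage; those field names are not defined in this section, so check them against the Event/Metadata documentation. The helper names and the sample metadata are hypothetical.

```python
def action_succeeded(metadata):
    """Return True if the metadata dict reports that the last action succeeded."""
    return bool(metadata.get("lastActionSuccess", False))


def failure_reason(metadata):
    """Return the reported error message, or None when the action succeeded."""
    if action_succeeded(metadata):
        return None
    return metadata.get("errorMessage") or "no error message reported"


# Hand-written metadata for illustration; real values come from event.metadata.
blocked = {"lastActionSuccess": False, "errorMessage": "example: path is blocked"}
moved = {"lastActionSuccess": True, "errorMessage": ""}
```

With a running controller, the same checks would be applied to event.metadata after each call to controller.step.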

Next Steps

Continue on to the Event/Metadata documentation.