Teaching robots social autonomy from in situ human guidance


Science Robotics  23 Oct 2019:
Vol. 4, Issue 35, eaat1186
DOI: 10.1126/scirobotics.aat1186
  • Fig. 1 Diagram of the application of SPARC to HRI.

    A human teacher supervises a robot learning to interact with another human (e.g., a child in the context of education).

  • Fig. 2 Setup used in the study.

    A child interacts with the robot tutor with a large touchscreen sitting between them, displaying the learning activity; a human teacher provides guidance to the robot through a tablet and monitors the robot’s learning. Although the picture depicts an early laboratory pilot, the main study was conducted on actual school premises.

  • Fig. 3 Comparison of policy between the supervised and autonomous robot.

    (A) Comparison of the number of actions of each type executed by the robot in the autonomous and supervised conditions. Each point represents how often the robot executed an action with a child (n = 25 per condition). (B) Timing between each action and the last eating event (the actions “move to” and “move away” were not analyzed because they were executed rarely or never). Each point represents one execution of an action.

  • Fig. 4 Comparison of children’s behavior between the three conditions.

    (A) Number of different eating interactions produced by the children (corresponding to the exposure to learning units) for the four rounds of the game, for the three conditions. (B) Interaction time for the four rounds of the game for the three conditions. The dashed red line represents 2.25 min, the time at which unfed animals died without intervention, leading to an end of the game if the child did not feed animals enough.

  • Fig. 5 Summary of the action selection process in the supervised condition.

    Child number 1 corresponds to the beginning of the training; child number 25 corresponds to the end of the training. The “teacher-initiated actions” label represents each time the teacher manually selected an action not proposed by the robot.

  • Fig. 6 Simplified schematics of the architecture used to control the robot.

    A game (1) runs on a touchscreen between the child and the robot. (2) analyzes the state of the game using inputs from the game and the camera. (3) is an interface running on a tablet and used by the teacher to control and teach the robot. (4) communicates actions between the interface (3) and the learner (7). (5) translates teacher’s actions into robotic commands used by (6) and (8) and executed by the robot (9). Last, (7) is the learning algorithm, which defines a policy based on the state perceived and the previous actions selected by the teacher, their substates, and their feedback on propositions. The different nodes communicate using Robot Operating System (ROS).

  • Fig. 7 Flowchart of the action selection.

    Mixed-initiative control is achieved via a combination of actions selected by the teacher, propositions from the robot, and corrections of propositions by the teacher. The algorithm uses instances x, each a tuple of an action, a; a substate, s′; and a reward, r. The substate s′ is defined on S′, with S′ ⊂ S, and N′ is the set of the indexes of the n′ selected dimensions of s′.
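
    The instance tuple described in the caption can be illustrated with a small instance-based selector. This is a minimal sketch under stated assumptions, not the learner used in the study: the names `Instance`, `similarity`, and `propose_action`, the squared-distance similarity, the reward weighting, and the proposal threshold are all hypothetical, and state values are assumed to be normalized to [0, 1]. What it does take from the caption is that each instance stores an action, a reward, and a substate restricted to a set of selected dimension indexes (N′), so only those dimensions are compared with the current state.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Instance:
    action: str                 # a: the action the teacher selected or evaluated
    substate: Dict[int, float]  # s': maps each index in N' to its value
    reward: float               # r: assumed positive for approved actions


def similarity(state: List[float], inst: Instance) -> float:
    """Assumed metric: 1 minus the mean squared difference over N' only."""
    if not inst.substate:
        return 0.0
    sq = [(state[i] - v) ** 2 for i, v in inst.substate.items()]
    return 1.0 - sum(sq) / len(sq)


def propose_action(
    state: List[float],
    memory: List[Instance],
    threshold: float = 0.5,  # hypothetical cut-off below which nothing is proposed
) -> Optional[Tuple[str, float]]:
    """Return (action, score) for the best-scoring action, or None."""
    best: Dict[str, float] = {}
    for inst in memory:
        score = similarity(state, inst) * inst.reward
        if inst.action not in best or score > best[inst.action]:
            best[inst.action] = score
    if not best:
        return None
    action, score = max(best.items(), key=lambda kv: kv[1])
    return (action, score) if score >= threshold else None


# Two stored instances, each defined on a single dimension of a 3-D state.
memory = [
    Instance("congrats", {0: 1.0}, 1.0),
    Instance("encouragement", {1: 1.0}, 1.0),
]
proposal = propose_action([1.0, 0.0, 0.3], memory)
```

    In a mixed-initiative loop of the kind the caption describes, a non-empty proposal would be shown to the teacher, whose confirmation or correction would then be stored as a new instance.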

  • Table 1 Example of events during the first minute of the first round of the interaction with the 23rd child in the supervised condition.

    Events beginning with “robot” represent suggestions from the robot; events beginning with “teacher” are the reactions from the teacher. “mvc” is the abbreviation of the move close action, and times are provided in seconds. Words in italics refer to items on the screen, followed (if applicable) by the number of the specific item involved in the interaction.

    Time (s) | Event | Time (s) | Event
    4.1 | Childtouch frog | 32.5 | Childrelease dragonfly
    4.3 | Failinteraction frog wheat-3 | 34.4 | Childtouch wolf
    4.9 | Animaleats frog fly | 34.7 | Robot proposes remind rules
    5.8 | Childrelease frog | 35 | Animaleats wolf mouse
    6.6 | Robot proposes congrats | 36 | Teacher selects wait
    7.6 | Childtouch fly | 36 | Animaleats wolf mouse
    7.6 | Teacher selects wait | 37.2 | Childrelease wolf
    8 | Animaleats fly apple-4 | 37.7 | Childtouch grasshopper
    8.3 | Childrelease fly | 38.3 | Robot proposes congrats
    9.1 | Teacher selects congrats | 42.1 | Failinteraction grasshopper apple-1
    9.1 | Childtouch frog | 42.7 | Childrelease grasshopper
    10.3 | Childrelease frog | 42.7 | Failinteraction grasshopper apple-1
    10.8 | Childtouch frog | 44.4 | Teacher selects instead mvc mouse-wheat-1
    11.2 | Animaleats frog fly | 44.6 | Robottouch mouse
    12.4 | Failinteraction frog apple-2 | 44.7 | Childtouch butterfly
    12.5 | Animaleats frog fly | 45.1 | Failinteraction butterfly wheat-2
    13.2 | Childrelease frog | 45.6 | Childrelease wheat-1
    14.2 | Childtouch fly | 45.6 | Robotrelease mouse
    14.5 | Animaleats fly apple-2 | 45.7 | Robottouch mouse
    14.6 | Robot proposes encouragement | 48.9 | Robotrelease mouse
    15 | Childrelease fly | 49.3 | Childtouch butterfly
    15.4 | Animaleats fly apple-3 | 49.3 | Failinteraction butterfly wheat-1
    16.9 | Teacher confirms encouragement | 49.6 | Childrelease butterfly
    18.2 | Childtouch snake | 50 | Childtouch mouse
    18.4 | Failinteraction snake wheat-3 | 50.3 | Animaleats mouse wheat-1
    18.7 | Animaleats snake bird | 51 | Childrelease mouse
    19.6 | Animaleats snake bird | 51.1 | Animaleats mouse wheat-2
    20.5 | Childrelease snake | 51.4 | Robot proposes congrats
    20.6 | Failinteraction snake wheat-4 | 52.3 | Teacher confirms congrats
    20.6 | Robot proposes congrats | 52.9 | Childtouch snake
    20.9 | Childtouch eagle | 52.9 | Failinteraction snake wheat-3
    21.1 | Animaleats eagle bird | 53.2 | Childrelease snake
    22 | Animaleats eagle bird | 53.5 | Childtouch mouse
    22.4 | Childrelease eagle | 53.6 | Animaleats mouse wheat-3
    23.3 | Animaldead bird | 54.4 | Robot proposes congrats
    23.4 | Teacher selects instead mvc dragonfly - fly | 54.5 | Animaleats mouse wheat-4
    23.6 | Robottouch dragonfly | 55 | Childrelease mouse
    26.9 | Robotrelease dragonfly | 55.6 | Childtouch dragonfly
    27.7 | Childtouch fly | 56.1 | Teacher selects wait
    28 | Childrelease fly | 56.8 | Failinteraction dragonfly apple-1
    28.4 | Childtouch dragonfly | 57.3 | Childrelease dragonfly
    28.6 | Failinteraction dragonfly apple-1 | 57.5 | Failinteraction dragonfly apple-1
    29.1 | Childrelease dragonfly | 58.6 | Childtouch grasshopper
    29.4 | Failinteraction dragonfly apple-1 | 58.6 | Failinteraction grasshopper apple-1
    30.3 | Childtouch dragonfly | 58.8 | Childrelease undefined
    30.3 | Failinteraction dragonfly apple-1 | 59.1 | Childtouch dragonfly
    30.7 | Robot proposes encouragement | 59.1 | Failinteraction dragonfly apple-1
    31 | Failinteraction dragonfly apple-1 | 59.2 | Failinteraction grasshopper apple-1
    31.8 | Teacher selects wait | 59.9 | Failinteraction dragonfly apple-1
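
    An event log like Table 1 can be tallied by event prefix to see how often the robot proposed an action and how the teacher responded. The sketch below is illustrative only: the function names and the category labels ("proposal", "confirmation", "replacement", "selection", "game") are hypothetical names chosen here, not terms defined by the study, and the sample rows are a handful copied from the table rather than the full log.

```python
from collections import Counter
from typing import Iterable, Tuple

# (prefix, label) pairs; the more specific "selects instead" prefix is
# tested before the generic "selects" prefix. Labels are hypothetical.
PREFIXES = (
    ("Robot proposes", "proposal"),
    ("Teacher confirms", "confirmation"),
    ("Teacher selects instead", "replacement"),
    ("Teacher selects", "selection"),
)


def classify(event: str) -> str:
    """Map an event string to a label; anything unmatched counts as 'game'."""
    for prefix, label in PREFIXES:
        if event.startswith(prefix):
            return label
    return "game"


def tally(events: Iterable[Tuple[float, str]]) -> Counter:
    """Count events per label for (time_in_seconds, event_text) pairs."""
    return Counter(classify(text) for _, text in events)


# A few rows copied from Table 1.
sample = [
    (6.6, "Robot proposes congrats"),
    (7.6, "Childtouch fly"),
    (7.6, "Teacher selects wait"),
    (16.9, "Teacher confirms encouragement"),
    (23.4, "Teacher selects instead mvc dragonfly - fly"),
    (23.6, "Robottouch dragonfly"),
]
counts = tally(sample)
```

    Note that "Robottouch" events are counted as game events rather than proposals, because only the "Robot proposes" prefix marks a suggestion.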

Supplementary Materials


    Fig. S1. Steps of the study.

    Table S1. Post hoc comparison of timing of actions for the supervised condition.

    Table S2. Post hoc comparison of timing of actions for the autonomous condition.

    Table S3. Exposure to learning units.

    Table S4. Game duration.
