Fig. 2 Setup used in the study. A child interacts with the robot tutor, with a large touchscreen between them displaying the learning activity; a human teacher guides the robot through a tablet and monitors its learning. Although the picture depicts an early laboratory pilot, the main study was conducted on actual school premises.
Fig. 3 Comparison of policies between the supervised and autonomous robot. (A) Number of actions of each type executed by the robot in the autonomous and supervised conditions. Each point represents how often the robot executed an action with one child (n = 25 per condition). (B) Timing between each action and the last eating event (the actions “move to” and “move away” were not analyzed because they were executed rarely or never). Each point represents one execution of an action.
Fig. 4 Comparison of children’s behavior between the three conditions. (A) Number of distinct eating interactions produced by the children (corresponding to their exposure to learning units) over the four rounds of the game, for the three conditions. (B) Interaction time over the four rounds of the game, for the three conditions. The dashed red line marks 2.25 min, the time at which unfed animals died without intervention, ending the game if the child did not feed the animals often enough.
Fig. 5 Summary of the action selection process in the supervised condition. Child number 1 corresponds to the beginning of the training; child number 25 corresponds to the end of the training. The “teacher-initiated actions” label marks each occasion on which the teacher manually selected an action not proposed by the robot.
Fig. 6 Simplified schematic of the architecture used to control the robot. A game (1) runs on a touchscreen between the child and the robot. Node (2) analyzes the state of the game using inputs from the game and the camera. Node (3) is an interface running on a tablet, used by the teacher to control and teach the robot. Node (4) communicates actions between the interface (3) and the learner (7). Node (5) translates the teacher’s actions into robot commands used by (6) and (8) and executed by the robot (9). Finally, (7) is the learning algorithm, which defines a policy based on the perceived state and on the actions previously selected by the teacher, their substates, and the teacher’s feedback on propositions. The nodes communicate using the Robot Operating System (ROS).
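The data flow between these nodes can be mimicked with a toy in-process message bus. This is an illustrative sketch only, not the authors’ code: the real nodes communicate over ROS topics, and every name here (`Bus`, `game_state`, `proposition`) is an assumption made for the example.

```python
# Toy publish/subscribe bus mirroring the node graph of Fig. 6:
# game state -> learner -> teacher interface.  A stand-in for ROS topics,
# for illustration only; all topic and class names are hypothetical.
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class Bus:
    """Minimal stand-in for ROS topic-based communication."""

    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[Any], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[Any], None]) -> None:
        self._subs[topic].append(callback)

    def publish(self, topic: str, msg: Any) -> None:
        for cb in self._subs[topic]:
            cb(msg)


bus = Bus()
log: List[Any] = []

# Learner node: turns a perceived game state into an action proposition.
bus.subscribe(
    "game_state",
    lambda s: bus.publish("proposition", {"action": "congrats", "state": s}),
)
# Teacher-interface node: receives (here, just records) the robot's proposition.
bus.subscribe("proposition", log.append)

bus.publish("game_state", {"animals_fed": 3})
# log now holds the proposition forwarded to the teacher's tablet
```

In the real system each node is a separate ROS process; the single-process bus above only shows the shape of the exchange, not its transport.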
Fig. 7 Flowchart of the action selection. Mixed-initiative control is achieved through a combination of actions selected by the teacher, propositions from the robot, and the teacher’s corrections of those propositions. The algorithm uses instances x, each a tuple of an action a, a substate s′, and a reward r. The substate s′ is defined on S′, with S′ ⊂ S, and N′ is the set of the indexes of the n′ selected dimensions of s′.
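A minimal sketch of such an instance-based policy is given below, assuming a nearest-neighbour rule over the stored instances. The caption specifies only the tuple (a, s′, r); the distance metric, reward coding, and all class and method names here are illustrative assumptions, not the authors’ algorithm.

```python
# Hypothetical instance-based action-proposal policy in the spirit of Fig. 7:
# each instance x is a tuple (substate s', action a, reward r), and the robot
# proposes the action of the nearest positively rewarded instance.
from dataclasses import dataclass
from math import dist
from typing import List, Optional, Tuple


@dataclass
class Instance:
    substate: Tuple[float, ...]  # s', restricted to the n' selected dimensions
    action: str                  # a
    reward: float                # r (e.g. +1 confirmed, -1 refused; assumed coding)


class MixedInitiativePolicy:
    def __init__(self, dims: Tuple[int, ...]) -> None:
        self.dims = dims                      # N': indexes of the selected dimensions
        self.instances: List[Instance] = []

    def project(self, state: Tuple[float, ...]) -> Tuple[float, ...]:
        """Project the full state s onto the substate s' (dimensions in N')."""
        return tuple(state[i] for i in self.dims)

    def propose(self, state: Tuple[float, ...]) -> Optional[str]:
        """Propose the action of the nearest positively rewarded instance."""
        s = self.project(state)
        positives = [x for x in self.instances if x.reward > 0]
        if not positives:
            return None  # nothing learned yet; the teacher must act
        return min(positives, key=lambda x: dist(x.substate, s)).action

    def record(self, state: Tuple[float, ...], action: str, reward: float) -> None:
        """Store teacher feedback (selection, confirmation, or correction)."""
        self.instances.append(Instance(self.project(state), action, reward))
```

Under these assumptions, a teacher’s corrections simply enter the store as new instances, so subsequent propositions drift toward the teacher’s recent choices.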
Table 1 Example of events during the first minute of the first round of the interaction with the 23rd child in the supervised condition.
Events beginning with “robot” represent suggestions from the robot; events beginning with “teacher” are the teacher’s reactions. “mvc” abbreviates the “move close” action, and times are given in seconds. Words in italics refer to items on the screen, followed (where applicable) by the number of the specific item involved in the interaction.
Time (s)  Event
4.1   Childtouch frog
4.3   Failinteraction frog wheat-3
4.9   Animaleats frog fly
5.8   Childrelease frog
6.6   Robot proposes congrats
7.6   Childtouch fly
7.6   Teacher selects wait
8     Animaleats fly apple-4
8.3   Childrelease fly
9.1   Teacher selects congrats
9.1   Childtouch frog
10.3  Childrelease frog
10.8  Childtouch frog
11.2  Animaleats frog fly
12.4  Failinteraction frog apple-2
12.5  Animaleats frog fly
13.2  Childrelease frog
14.2  Childtouch fly
14.5  Animaleats fly apple-2
14.6  Robot proposes encouragement
15    Childrelease fly
15.4  Animaleats fly apple-3
16.9  Teacher confirms encouragement
18.2  Childtouch snake
18.4  Failinteraction snake wheat-3
18.7  Animaleats snake bird
19.6  Animaleats snake bird
20.5  Childrelease snake
20.6  Failinteraction snake wheat-4
20.6  Robot proposes congrats
20.9  Childtouch eagle
21.1  Animaleats eagle bird
22    Animaleats eagle bird
22.4  Childrelease eagle
23.3  Animaldead bird
23.4  Teacher selects instead mvc dragonfly-fly
23.6  Robottouch dragonfly
26.9  Robotrelease dragonfly
27.7  Childtouch fly
28    Childrelease fly
28.4  Childtouch dragonfly
28.6  Failinteraction dragonfly apple-1
29.1  Childrelease dragonfly
29.4  Failinteraction dragonfly apple-1
30.3  Childtouch dragonfly
30.3  Failinteraction dragonfly apple-1
30.7  Robot proposes encouragement
31    Failinteraction dragonfly apple-1
31.8  Teacher selects wait
32.5  Childrelease dragonfly
34.4  Childtouch wolf
34.7  Robot proposes remind rules
35    Animaleats wolf mouse
36    Teacher selects wait
36    Animaleats wolf mouse
37.2  Childrelease wolf
37.7  Childtouch grasshopper
38.3  Robot proposes congrats
42.1  Failinteraction grasshopper apple-1
42.7  Childrelease grasshopper
42.7  Failinteraction grasshopper apple-1
44.4  Teacher selects instead mvc mouse-wheat-1
44.6  Robottouch mouse
44.7  Childtouch butterfly
45.1  Failinteraction butterfly wheat-2
45.6  Childrelease wheat-1
45.6  Robotrelease mouse
45.7  Robottouch mouse
48.9  Robotrelease mouse
49.3  Childtouch butterfly
49.3  Failinteraction butterfly wheat-1
49.6  Childrelease butterfly
50    Childtouch mouse
50.3  Animaleats mouse wheat-1
51    Childrelease mouse
51.1  Animaleats mouse wheat-2
51.4  Robot proposes congrats
52.3  Teacher confirms congrats
52.9  Childtouch snake
52.9  Failinteraction snake wheat-3
53.2  Childrelease snake
53.5  Childtouch mouse
53.6  Animaleats mouse wheat-3
54.4  Robot proposes congrats
54.5  Animaleats mouse wheat-4
55    Childrelease mouse
55.6  Childtouch dragonfly
56.1  Teacher selects wait
56.8  Failinteraction dragonfly apple-1
57.3  Childrelease dragonfly
57.5  Failinteraction dragonfly apple-1
58.6  Childtouch grasshopper
58.6  Failinteraction grasshopper apple-1
58.8  Childrelease undefined
59.1  Childtouch dragonfly
59.1  Failinteraction dragonfly apple-1
59.2  Failinteraction grasshopper apple-1
59.9  Failinteraction dragonfly apple-1
Supplementary Materials
robotics.sciencemag.org/cgi/content/full/4/35/eaat1186/DC1
Fig. S1. Steps of the study.
Table S1. Post hoc comparison of timing of actions for the supervised condition.
Table S2. Post hoc comparison of timing of actions for the autonomous condition.
Table S3. Exposure to learning units.
Table S4. Game duration.