Apprentice Scheduler

Technology #18626

Professor Julie Shah
Department of Aeronautics and Astronautics
Matthew Gombolay
Department of Aeronautics and Astronautics
Managed By
Daniel Dardani
MIT Technology Licensing Officer
Patent Protection

Human-Machine Collaborative Optimization Via Apprenticeship Scheduling

US Patent Pending
Apprenticeship scheduling: Learning to schedule from human experts
Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI), July 2016


This invention is a system for allocating resources under specified time and labor constraints. Nearly every profession needs optimized resource allocation, which makes this technology applicable to a wide variety of fields, including healthcare, manufacturing, and military engagements.

Problem Addressed

Coordinating agents to complete a set of tasks under temporal and resource constraints is a challenging problem that human domain experts solve using knowledge acquired through years of apprenticeship. Manually codifying this domain knowledge within a computational framework is necessary to scale beyond the “single-expert, single-trainee” apprenticeship model. However, human domain experts often have difficulty describing their decision-making processes, which makes codifying this knowledge laborious. The inventors have developed a new approach that captures domain-expert heuristics through a pairwise ranking formulation, which accurately learns multifaceted heuristics on both synthetic and real-world data sets.


This technique, called “apprenticeship scheduling,” captures domain knowledge in the form of a scheduling policy. Its objective is to learn scheduling policies from expert demonstrations and to validate that the schedules these policies produce are comparable in quality to those generated by human or synthetic experts. The approach makes efficient use of domain-expert demonstrations and does not need to train within an environment emulator. Rather than explicitly modeling a reward function and relying on dynamic programming or constraint solvers, which become computationally infeasible for large-scale problems of interest, the inventors use action-driven learning to extract the strategies of domain experts and schedule tasks efficiently.

This approach uses pairwise comparisons between the actions taken (e.g., schedule agent a to complete task T_i at time t) and the alternatives not taken (e.g., tasks left unscheduled at time t) to learn the relevant model parameters and the scheduling policies demonstrated by the training examples. The approach is validated using both a synthetic data set of solutions for a variety of scheduling problems and a real-world data set of demonstrations from human experts solving a variant of the weapon-to-target assignment problem. These synthetic and real-world problem domains represent two of the most challenging classes within a well-established taxonomy of scheduling problems.
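The pairwise comparison idea can be illustrated with a minimal Python sketch. It is not the inventors' implementation. It assumes that each expert decision is recorded as a list of candidate actions plus the index of the action the expert chose, and that every action is described by a hypothetical two-element feature vector (deadline, duration). A small logistic regression trained by stochastic gradient descent stands in for whatever classifier is actually used, and the "expert" is a synthetic earliest-deadline-first rule invented for this example.

```python
import math
import random


def pairwise_examples(demonstrations):
    """Convert expert choices into labeled feature-difference vectors."""
    examples = []
    for candidates, chosen in demonstrations:
        for j, other in enumerate(candidates):
            if j == chosen:
                continue
            diff = [a - b for a, b in zip(candidates[chosen], other)]
            examples.append((diff, 1))                # chosen action vs. a rejected one
            examples.append(([-d for d in diff], 0))  # same pair, reversed
    return examples


def sigmoid(z):
    z = max(-30.0, min(30.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def train(examples, n_features, epochs=50, lr=0.05, seed=0):
    """Fit logistic-regression weights on difference vectors with plain SGD."""
    rng = random.Random(seed)
    examples = list(examples)
    w = [0.0] * n_features
    for _ in range(epochs):
        rng.shuffle(examples)
        for diff, label in examples:
            error = label - sigmoid(dot(w, diff))
            w = [wi + lr * error * di for wi, di in zip(w, diff)]
    return w


def choose_action(w, candidates):
    """Pick the candidate with the highest summed pairwise preference."""
    def preference(i):
        return sum(
            sigmoid(dot(w, [a - b for a, b in zip(candidates[i], other)]))
            for j, other in enumerate(candidates) if j != i
        )
    return max(range(len(candidates)), key=preference)


# Hypothetical synthetic expert: always schedules the earliest-deadline task.
# Each task is (deadline, duration); both features are assumptions.
rng = random.Random(1)
demonstrations = []
for _ in range(40):
    tasks = [(rng.uniform(0, 10), rng.uniform(0, 5)) for _ in range(4)]
    expert_choice = min(range(len(tasks)), key=lambda i: tasks[i][0])
    demonstrations.append((tasks, expert_choice))

weights = train(pairwise_examples(demonstrations), n_features=2)
print(choose_action(weights, [(9.0, 1.0), (2.0, 1.0), (6.0, 1.0)]))
```

The learned policy never sees a reward function or a state-space enumeration. It sees only which action the expert preferred over each alternative. If training recovers the earliest-deadline preference, the final call selects the task at index 1, the one with the smallest deadline.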


  • Approach allows human decision-making heuristics to be applied to problems that extend beyond a one-on-one apprenticeship model
  • Model-free approach does not require enumerating or iterating through a large state-space
  • Approach can be trained to solve scheduling problems on both synthetic and real-world data sets