
What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations

This is the official repository for the paper "What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations", including the dataset (ARGO1M) and the code.

BibTeX

If you use the ARGO1M dataset and/or our CIR method code, please cite:

@inproceedings{Plizzari2023,
  title={What can a cook in Italy teach a mechanic in India? Action Recognition Generalisation Over Scenarios and Locations},
  author={Plizzari, Chiara and Perrett, Toby and Caputo, Barbara and Damen, Dima},
  booktitle={ICCV},
  year={2023}
}

Requirements

We provide training scripts for CIR to replicate the paper's results. To install dependencies:

conda env create -f environment.yml
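
After creating the environment, activate it before running any of the scripts below. The environment name shown here is an assumption; use the name defined in environment.yml:

conda activate CIR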

Dataset: ARGO1M

How to download ARGO1M

The annotated clips making up ARGO1M are curated from videos in the large-scale Ego4D dataset. Before using ARGO1M, you therefore need to sign the EGO4D License Agreement. Follow these three steps to download the dataset:

  1. Go to ego4ddataset.com to review and execute the EGO4D License Agreement. When your license agreement is approved, which takes around 48 hours, you will be emailed a set of AWS access credentials.

  2. The datasets are hosted on Amazon S3 and require credentials to access. The AWS CLI uses the credentials stored in the home directory file ~/.aws/credentials. If you already have credentials configured, you can skip this step. If not:

  • Install the AWS CLI.
  • Open a command line and type aws configure.
  • Leave the default region blank, and enter your AWS access key ID and secret access key when prompted.
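
Once aws configure completes, ~/.aws/credentials should contain entries like the following (placeholder values shown):

[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx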

The CLI requires Python >= 3.8. Please install the prerequisites via python setup.py install at the repo root, or via pip install -r requirements.txt.

  3. Download the dataset using the following command: python code/scripts/download_all.py --flag DEST_DIR, where flag is either ffcv or csv.

You can directly download our FFCV encodings for all ARGO1M splits, as well as the CSV files described below.
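
For example, to fetch the FFCV encodings into a local data directory (the exact flag syntax is an assumption; check the script's --help if it differs):

python code/scripts/download_all.py --ffcv ./data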

CSV

We provide .csv files for all the proposed splits.

Each file contains the following entries:

  • uid: uid of the video clip;
  • scenario_idx: scenario label (index-scenario association in index_scenario.txt);
  • location_idx: location label (index-location association in index_location.txt);
  • label: action label (index-action association in index_verb.txt);
  • timestamp: starting timestamp;
  • timeframe: starting timeframe;
  • narration: narration;
  • action_start_feature_idx: starting feature index for SlowFast pre-extracted features;
  • action_end_feature_idx: ending feature index for SlowFast pre-extracted features.
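
As an illustrative sketch, a split file can be inspected with pandas. The CSV file name below is a placeholder, and we assume index_verb.txt lists one action name per line in index order:

import pandas as pd

# Load one of the provided split files (name is a placeholder).
df = pd.read_csv("train_split.csv")

# Assumed format: one action name per line, in index order.
with open("index_verb.txt") as f:
    verbs = [line.strip() for line in f]

# Resolve the action label of the first clip.
clip = df.iloc[0]
print(clip["uid"], verbs[clip["label"]], clip["narration"])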

FFCV

To speed up training, we used FFCV encodings of both training and test sets for each of the proposed splits.

We also provide the scripts for generating them from the given CSV files. After downloading the Ego4D SlowFast features, you can extract the FFCV encodings by running:

python /scripts/dataset_ffcv_encode.py --config /configs/{config_file}.yaml --split {split_name}
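
At training time, the resulting encodings can be read back with the FFCV Loader. A minimal sketch, assuming a generated .beton file and default decoding pipelines (the file name, batch size and ordering are illustrative):

from ffcv.loader import Loader, OrderOption

# Iterate over a generated FFCV encoding (file name is a placeholder).
loader = Loader("train_split.beton",
                batch_size=64,
                num_workers=4,
                order=OrderOption.RANDOM)

for batch in loader:
    # Each batch holds the fields written by dataset_ffcv_encode.py.
    pass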

Code structure

We designed the code to make it easy to try your own methods and losses on top of it. Suppose you want to introduce a new module called MyModule into our pipeline; the steps are listed below, with a minimal sketch after the list.

  • You can define MyModule in models.py.

  • In the corresponding config.yaml, add MyModule to model_types, with corresponding attributes in model_names, model_lrs, model_use_train, model_use_eval and step.

  • In model_inputs, you can specify the inputs to MyModule by prepending the name of the model that provides each output, in the form {"arg":"other_model_name.output_name"}, e.g. {"input_logits":"mlp.logits"}.

  • You can do the same for a new loss by adding MyLoss to loss_types, along with the corresponding loss_names, and by specifying the corresponding loss_inputs in the form {"arg":"other_model_name.output_name"}, e.g. {"logits":"mlp.logits"}.
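
Here is the sketch of the module side. The calling convention (keyword arguments resolved from model_inputs, outputs collected by name) is an assumption based on the config keys above, and the dimensions are placeholders:

import torch.nn as nn

class MyModule(nn.Module):
    # Hypothetical module; register it under model_types in config.yaml.
    def __init__(self, in_dim=2048, num_classes=60):
        super().__init__()
        self.head = nn.Linear(in_dim, num_classes)

    def forward(self, input_logits):
        # "input_logits" is resolved via model_inputs, e.g.
        # {"input_logits": "mlp.logits"} feeds the mlp model's logits here.
        # Outputs are returned by name so other modules and losses can
        # reference them, e.g. as "MyModule.logits".
        return {"logits": self.head(input_logits)}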

Steps for training

The scripts folder contains code and bash scripts to reproduce the paper results. To re-create the CIR results:

  1. Modify the internal paths in the configs to match the location of your FFCV data.

  2. Run python run.py --config configs/config_run/run_CIR.yaml

License

All files in this repository are copyright by us and published under the Creative Commons Attribution-NonCommercial 4.0 International License. This means that you must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. You may not use the material for commercial purposes.
