<!DOCTYPE html>
<html lang="en">
<title>SHREC 2022 UNIVR-VIPS</title>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://www.w3schools.com/w3css/4/w3.css">
<link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Poppins">
<style>
body,h1,h2,h3,h4,h5 {font-family: "Poppins", sans-serif}
body {font-size:16px;}
.w3-half img{margin-bottom:-6px;margin-top:16px;opacity:0.8;cursor:pointer}
.w3-half img:hover{opacity:1}
</style>
<body>
<!-- Sidebar/menu -->
<nav class="w3-sidebar w3-red w3-collapse w3-top w3-large w3-padding" style="z-index:3;width:300px;font-weight:bold;" id="mySidebar"><br>
<a href="javascript:void(0)" onclick="w3_close()" class="w3-button w3-hide-large w3-display-topleft" style="width:100%;font-size:22px">Close Menu</a>
<div class="w3-container">
<img src="vips.png">
</div>
<div class="w3-bar-block">
<a href="#" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Home</a>
<a href="#motivation" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Motivation</a>
<a href="#dataset" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Dataset</a>
<a href="#task" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Task and evaluation </a>
<a href="#reg" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Registration-Submission</a>
<a href="#dates" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Important dates</a>
<a href="#organizers" onclick="w3_close()" class="w3-bar-item w3-button w3-hover-white">Organizers</a>
</div>
</nav>
<!-- Top menu on small screens -->
<header class="w3-container w3-top w3-hide-large w3-red w3-xlarge w3-padding">
<a href="javascript:void(0)" class="w3-button w3-red w3-margin-right" onclick="w3_open()">☰</a>
<span>UNIVR-VIPS lab</span>
</header>
<!-- Overlay effect when opening sidebar on small screens -->
<div class="w3-overlay w3-hide-large" onclick="w3_close()" style="cursor:pointer" title="close side menu" id="myOverlay"></div>
<!-- !PAGE CONTENT! -->
<div class="w3-main" style="margin-left:340px;margin-right:40px">
<!-- Header -->
<div class="w3-container" style="margin-top:80px" id="showcase">
<h1 class="w3-xxxlarge"><b>SHREC 2022 track on <br/>
online detection of heterogeneous gestures </b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
</div>
<!-- Photo grid (modal) -->
<div class="w3-row-padding">
<div class="w3-half">
<img src="gestures.png" style="width:100%" onclick="onClick(this)" alt="gestures">
</div>
<div class="w3-half">
<img src="hololens2.jpg" style="width:100%" onclick="onClick(this)" alt="holo2">
</div>
</div>
<!-- Modal for full size images on click-->
<div id="modal01" class="w3-modal w3-black" style="padding-top:0" onclick="this.style.display='none'">
<span class="w3-button w3-black w3-xxlarge w3-display-topright">×</span>
<div class="w3-modal-content w3-animate-zoom w3-center w3-transparent w3-padding-64">
<img id="img01" class="w3-image">
<p id="caption"></p>
</div>
</div>
<!-- Motivation -->
<div class="w3-container" id="motivation" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>Motivation</b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p> Following the interesting results of the previous edition <a href="https://univr-vips.github.io/Shrec21/"> (SHREC'21 Track on online gesture recognition in the wild)</a>, we are organizing a new edition of the contest, still aimed at benchmarking methods that detect and classify gestures from the 3D trajectories of the fingers' joints (a task requiring 3D geometry processing and therefore of interest to the SHREC community).
As the most recent generation of head-mounted displays for mixed reality (HoloLens 2, Oculus Quest, VIVE) provides accurate finger tracking, gestures can be exploited to create novel interaction paradigms for immersive VR and XR.
While we tried to maintain continuity with the 2021 edition, we made some important updates to the gesture dictionary and to the recognition task, and we provide a new dataset.
In the new dictionary, we removed the gesture classes that created ambiguities (POINTING, EXPAND), and the evaluation will consider not only correct gesture detection but also recognition latency. The frame rate of the acquisition is lower and not perfectly regular, and the timestamps of the captured frames are provided with the data.
</p>
</div>
<!-- Dataset -->
<div class="w3-container" id="dataset" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>Dataset</b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p>The dataset includes 288 sequences, each containing a variable number of gestures, divided into a training set, with provided annotations, and a test set, where the gestures have to be detected according to the protocol detailed below.
</p>
<b><a href="training_set.zip"> Click here to download the training data (with annotations)</a><br/></b>
<b><a href="test_set.zip"> Click here to download the test data (without annotations)</a><br/> </b>
<p>Finger data are captured with a HoloLens 2 device simulating mixed reality interaction. Time sequences are saved as text files where each row contains the data of a specific time frame: a frame index, a timestamp, and the coordinates of 26 joints. Each joint is characterized by 3 floats (x, y, z position), so each row is encoded as:<br/>
Frame Index(integer); Time_stamp(float); Joint1_x (float); Joint1_y; Joint1_z; Joint2_x; Joint2_y; Joint2_z; .....
<br/>
The joints' order is described <a href="table.html"> here (click for description) </a> and can also be found in the MRTK documentation <a href="https://docs.microsoft.com/en-us/dotnet/api/microsoft.mixedreality.toolkit.utilities.trackedhandjoint?view=mixed-reality-toolkit-unity-2020-dotnet-2.7.0"> available at this link</a>.
</p>
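<p>As an illustration of this layout, the following minimal Python sketch reads one sequence file into per-frame records (it assumes semicolon-separated fields, as in the row layout above; the file name in the example is hypothetical):</p>
<pre class="w3-light-grey" style="padding:12px;overflow-x:auto">
# Minimal sketch: read a sequence file into (frame_index, timestamp, joints) records,
# where joints is a 26x3 array of x, y, z coordinates.
# Assumes semicolon-separated fields, as in the row layout described above.
import numpy as np

NUM_JOINTS = 26

def read_sequence(path):
    frames = []
    with open(path) as f:
        for line in f:
            fields = [v for v in line.strip().split(";") if v.strip() != ""]
            if len(fields) != 2 + 3 * NUM_JOINTS:
                continue  # skip empty or malformed rows
            frame_index = int(float(fields[0]))
            timestamp = float(fields[1])
            joints = np.array(fields[2:], dtype=float).reshape(NUM_JOINTS, 3)
            frames.append((frame_index, timestamp, joints))
    return frames

# Example usage (hypothetical file name):
# frames = read_sequence("training_set/sequence_1.txt")
</pre>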
<p> The gesture dictionary is similar to the SHREC 2021 one, with a few changes. The current dictionary is composed of 16 gestures divided into 4 categories: static, characterized by a pose held fixed for at least 0.5 s; dynamic, characterized by a single trajectory of the hand; fine-grained dynamic, characterized by the fingers' articulation; and dynamic-periodic, where the same finger motion pattern is repeated several times. The description of the gestures <a href="table2.html"> is reported in this table</a>; a simple label-to-category mapping is also sketched after the list below.</p>
<div class="w3-half">
<dl>
<dt>Static Gestures</dt>
<dd>- ONE</dd>
<dd>- TWO</dd>
<dd>- THREE</dd>
<dd>- FOUR</dd>
<dd>- OK</dd>
<dd>- MENU</dd>
<br/>
<dt> Dynamic Gestures</dt>
<dd>- LEFT</dd>
<dd>- RIGHT</dd>
<dd>- CIRCLE</dd>
<dd>- V</dd>
<dd>- CROSS</dd>
<br/>
<dt> Fine-grained dynamic Gestures</dt>
<dd>- GRAB</dd>
<dd>- PINCH</dd>
<br/>
<dt> Dynamic-periodic Gestures</dt>
<dd>- DENY</dd>
<dd>- WAVE </dd>
<dd>- KNOB</dd>
</dl>
</div>
<div class="w3-half">
<img src="dictionary.png" width=80%>
</div>
</div>
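<p>As mentioned above, the dictionary can be encoded as a simple label-to-category mapping, useful e.g. for a per-category analysis of the results (a minimal Python sketch; the category names are those listed above):</p>
<pre class="w3-light-grey" style="padding:12px;overflow-x:auto">
# The 16 gesture labels of the dictionary, grouped by category as listed above.
GESTURE_CATEGORY = {
    "ONE": "static", "TWO": "static", "THREE": "static",
    "FOUR": "static", "OK": "static", "MENU": "static",
    "LEFT": "dynamic", "RIGHT": "dynamic", "CIRCLE": "dynamic",
    "V": "dynamic", "CROSS": "dynamic",
    "GRAB": "fine-grained dynamic", "PINCH": "fine-grained dynamic",
    "DENY": "dynamic-periodic", "WAVE": "dynamic-periodic", "KNOB": "dynamic-periodic",
}

GESTURE_LABELS = sorted(GESTURE_CATEGORY)  # the 16 valid labels
</pre>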
<p>
Training sequences are distributed together with an annotation file, <b> annotations.txt </b>. It is a text file with one line per sequence, where each gesture is followed by two integer numbers: the gesture execution start and the gesture execution end (frame indices, annotated in post-processing).
For example, if the first sequence features a PINCH and a ONE and the second a THREE, a LEFT and a KNOB, the first two lines of the annotation file may look like the following:
<dl>
<dd>1;PINCH;10;30;ONE;85;130;</dd>
<dd>2;THREE;18;75;LEFT;111;183;KNOB;222;298;</dd>
<dd> .... </dd>
</dl>
We also provide an additional file (train_acquisition programming.txt) that records the time at which the command to perform each gesture was given to the performing subjects. It features one line per sequence with a series of time values followed by gesture labels. In practice, at each time value the HoloLens app worn by the user played a recorded voice prompt instructing the subject to start the corresponding gesture. These values may be useful for research purposes (for example, to analyze reaction times and preparatory movements) and are released to the participants, but they will not be used in the evaluation. For the evaluation of the detection performance, we will rely on the frame labels annotated in post-processing.
</p>
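<p>The annotation layout can be parsed with a few lines of code; the following Python sketch (assuming the semicolon-separated format shown above, with a label and two frame indices per gesture) returns a dictionary from sequence number to the list of annotated gestures:</p>
<pre class="w3-light-grey" style="padding:12px;overflow-x:auto">
# Minimal sketch: parse annotations.txt into {sequence_number: [(label, start, end), ...]}.
# Assumes one semicolon-separated line per sequence, with (label, start_frame, end_frame)
# triplets after the sequence number, as in the example above.
def read_annotations(path):
    annotations = {}
    with open(path) as f:
        for line in f:
            fields = [v for v in line.strip().split(";") if v.strip() != ""]
            if not fields:
                continue
            seq_number = int(fields[0])
            gestures = []
            for i in range(1, len(fields) - 2, 3):
                label = fields[i]
                start, end = int(fields[i + 1]), int(fields[i + 2])
                gestures.append((label, start, end))
            annotations[seq_number] = gestures
    return annotations
</pre>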
<!-- Task and evaluation -->
<div class="w3-container" id="task" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>Task and evaluation </b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p>
Participants should develop methods to detect and classify the gestures of the dictionary in the test sequences, based on the annotated examples in the training sequences. The developed methods should process the data simulating an online detection scenario, i.e., detect and classify gestures progressively while processing the trajectories from beginning to end. The submission must consist of a text file with one row per sequence containing the following information: the sequence number (the number in the filename of the training/test sequence) and, for each detected gesture, a text string and three numbers, all separated by semicolons. The string must be the gesture label (the string in the dictionary). The three numbers must be the frame number of the detected gesture start, the frame number of the detected gesture end, and the frame number of the last frame used by the algorithm for the gesture start prediction (frame numbers refer to the first column in the sequence file). For example, if the algorithm detects a PINCH and a ONE in the first sequence and a THREE, a LEFT, and a KNOB in the second, the first two lines of the results file will look like the following:
<dl>
<dd>1;PINCH;12;54;32;ONE;82;138;103;</dd>
<dd>2;THREE;18;75;38;LEFT;111;183;131;KNOB;222;298;318;</dd>
</dl>
<b> IMPORTANT: the third value for each gesture should be consistent with the algorithm description provided with the submission. </b>
<br/>
The evaluation is based on the number of correctly detected gestures, the false positive rate, and the detection latency. The Jaccard index of the continuous annotation of the frame sequence will also be evaluated. A gesture is considered correctly detected if it is continuously detected with the same label in a time window whose intersection with the ground truth annotation is larger than 50% of its length. The detection latency will be estimated as the difference between the actual gesture start and the reported timestamp of the last frame used for the prediction.
</p>
</div>
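<p>To make the criterion concrete, the following Python sketch shows one possible interpretation of the 50% overlap rule and of the latency for a single predicted gesture (this is an illustration, not the official evaluation script):</p>
<pre class="w3-light-grey" style="padding:12px;overflow-x:auto">
# Illustrative sketch (not the official evaluation code).
# A prediction matches a ground-truth gesture if it has the same label and its frame
# interval overlaps the ground-truth interval for more than 50% of the latter's length.
def is_correct_detection(pred_label, pred_start, pred_end, gt_label, gt_start, gt_end):
    if pred_label != gt_label:
        return False
    overlap = min(pred_end, gt_end) - max(pred_start, gt_start) + 1
    gt_length = gt_end - gt_start + 1
    return overlap > 0.5 * gt_length

# Latency: difference between the last frame used for the prediction and the annotated
# gesture start (expressed here in frames; timestamps can be used in the same way).
def detection_latency(last_frame_used, gt_start):
    return last_frame_used - gt_start
</pre>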
<div class="w3-container" id="reg" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>Registration and submission guidelines </b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p>
Participants should register by sending an e-mail to <b> andrea.giachetti(at)univr.it </b> within the deadline reported below.
Each registered group should then complete the submission within the <b> strict</b> deadline of February 28, by sending an email with the files attached or with links for downloading them.
The submission must include up to three text files with results formatted as described above. The results can be obtained with different algorithms or parameter settings. Submissions should preferably include executable code and instructions to reproduce the results. Each group must also submit, within the same strict deadline, a one-page LaTeX description of the method used, including all the important implementation details and reporting the computation time for a single-frame classification. Feedback on the evaluation will be provided to the participants as soon as possible to prepare the draft paper with the outcomes.
</p>
</div>
<!-- Dates -->
<div class="w3-container" id="task" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>IMPORTANT DATES</b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p>
<ul>
<li> <b> 28/1/2022</b> Dataset released </li>
<li> <b> 11/2/2022</b> Registration deadline </li>
<li> <b> 28/2/2022</b> Results submission deadline </li>
<li> <b> 15/3/2022</b> Draft paper with the track outcomes submitted to Computers &amp; Graphics</li>
</ul>
</p>
</div>
<!-- Contact -->
<div class="w3-container" id="contact" style="margin-top:75px">
<h1 class="w3-xxxlarge w3-text-red"><b>Organizers</b></h1>
<hr style="width:50px;border:5px solid red" class="w3-round">
<p>Organizing team</p>
<ul>
<li> Marco Emporio - VIPS lab, University of Verona</li>
<li> Anton Pirtac - VIPS lab, University of Verona</li>
<li> Ariel Caputo - VIPS lab, University of Verona</li>
<li> Marco Cristani - VIPS lab, University of Verona</li>
<li> Andrea Giachetti - VIPS lab, University of Verona</li>
</ul>
</div>
<!-- End page content -->
</div>
<!-- W3.CSS Container -->
<div class="w3-light-grey w3-container w3-padding-32" style="margin-top:75px;padding-right:58px"><p class="w3-right">Powered by <a href="https://www.w3schools.com/w3css/default.asp" title="W3.CSS" target="_blank" class="w3-hover-opacity">w3.css</a></p></div>
<script>
// Script to open and close sidebar
function w3_open() {
document.getElementById("mySidebar").style.display = "block";
document.getElementById("myOverlay").style.display = "block";
}
function w3_close() {
document.getElementById("mySidebar").style.display = "none";
document.getElementById("myOverlay").style.display = "none";
}
// Modal Image Gallery
function onClick(element) {
document.getElementById("img01").src = element.src;
document.getElementById("modal01").style.display = "block";
var captionText = document.getElementById("caption");
captionText.innerHTML = element.alt;
}
</script>
</body>
</html>