Adds working multiprocessing and preliminary but working documentation based on the PyData theme
Co-authored-by: Konstantin (Tino) Sering <[email protected]>
README.rst: 98 additions & 3 deletions
@@ -7,7 +7,12 @@ Readme
This package supplies the necessary functions to synthesize speech
from a phonemic transcription. Furthermore, it defines helpers to improve the
result if more information, such as the pitch contour, is available. It is
especially useful when working with the
`PAULE <https://github.com/quantling/paule>`__ framework.

Currently the package supports the following languages:

- German
- English


Version 2.0.0 and later
-----------------------
@@ -30,12 +35,102 @@ functions from top to bottom. The functions are supplied by the other files.
Please use the VTL API directly.


Minimal Example
===============

Given a German corpus with the following structure, which is what the
`Mozilla Common Voice project <https://commonvoice.mozilla.org>`__ provides:

.. code:: bash

    corpus/
    ├── validated.tsv                       # a file where the transcripts are stored
    ├── clips/
    │   └── *.mp3                           # audio files (mp3)
    └── files_not_relevant_to_this_project
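For a quick look at the input data, ``validated.tsv`` can be read with pandas
as a tab-separated file. A minimal sketch, not part of this package; the
column names ``client_id``, ``path``, and ``sentence`` are assumed from the
Common Voice format:

.. code:: python

    import pandas as pd

    # validated.tsv is tab-separated; each row links an audio clip
    # to its validated transcript and the contributing speaker.
    transcripts = pd.read_csv("corpus/validated.tsv", sep="\t")
    print(transcripts[["client_id", "path", "sentence"]].head())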
If you run the following command, the package will align the audio files for
you and then create a pandas DataFrame with the synthesized audio and other
information useful for the PAULE model, but only for the first 100 words that
occur 4 times or more. Since multiprocessing is used (``--use_mp``), no mel
spectrograms are generated:

.. code:: bash

    python -m create_vtl_corpus.create_corpus --corpus CORPUS --language de --needs_aligner --use_mp --min_word_count 4 --word_amount 100 --save_df_name SAVE_DF_NAME
The end product should look something like this:

.. code:: bash

    corpus/
    ├── validated.tsv                       # a file where the transcripts are stored
    ├── clips/
    │   ├── *.mp3                           # mp3 files
    │   └── *.lab                           # lab files
    ├── clips_validated/
    │   ├── *.mp3                           # validated mp3 files
    │   └── *.lab                           # validated lab files
    ├── clips_aligned/
    │   └── *.TextGrid                      # aligned TextGrid files
    ├── corpus_as_df.pkl                    # a pandas DataFrame with the information
    └── files_not_relevant_to_this_project
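The DataFrame is saved as a pickle file and can be loaded directly with
pandas. A minimal sketch; the file name follows the tree above (with
``--save_df_name`` it may differ), and the columns are described in the table
below:

.. code:: python

    import pandas as pd

    # Load the corpus DataFrame produced by create_vtl_corpus.
    df = pd.read_pickle("corpus/corpus_as_df.pkl")
    print(df.columns.tolist())
    print(df[["file_name", "label", "sentence"]].head())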
The DataFrame contains the following columns:

.. list-table:: DataFrame labels
   :header-rows: 1

   * - Column Name
     - Description
   * - file_name
     - Name of the clip
   * - label
     - The spoken word as it is in the aligned TextGrid
   * - lexical_word
     - The word as it is in the dictionary
   * - word_position
     - The position of the word in the sentence
   * - sentence
     - The sentence the word is part of
   * - wav_recording
     - The spliced-out audio as a mono audio signal
   * - sr_recording
     - Sampling rate of the recording
   * - sr_synthesized
     - Sampling rate of the synthesized audio
   * - sampa_phones
     - The SAMPA-like phonemes of the word
   * - mfa_phones
     - The phonemes as output by the aligner
   * - phone_durations_lists
     - The duration of each phone in the word, as a list
   * - cp_norm
     - Normalized CP-trajectories
   * - vector
     - Embedding vector of the word, based on FastText embeddings
   * - client_id
     - ID of the client
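Because ``--use_mp`` skips the mel spectrograms, they can be computed
afterwards from the ``wav_recording`` and ``sr_recording`` columns. A sketch
under the assumptions that ``librosa`` is installed (it is not required by
this package) and that ``wav_recording`` holds a float NumPy array:

.. code:: python

    import librosa
    import pandas as pd

    df = pd.read_pickle("corpus/corpus_as_df.pkl")
    row = df.iloc[0]

    # Compute a mel spectrogram from the spliced-out mono recording.
    mel = librosa.feature.melspectrogram(y=row["wav_recording"],
                                         sr=row["sr_recording"])
    print(mel.shape)  # (n_mels, n_frames)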
Copyright
=========

As the VocalTractLabAPI.so and the JD2.speaker are under GPL v3, the rest of
the code here is under GPL as well. If the code is not dependent on VTL
anymore, you can use it under the MIT license.


Citing
======

If you use this code for your research, please cite the following thesis:

Konstantin Sering. Predictive articulatory speech synthesis utilizing lexical
embeddings (PAULE). PhD thesis, Universität Tübingen, 2023.