LSTM Poetry with Text-to-Speech

For the final week’s project using ml5.js, I put together an LSTM text generator with a poetry model and text-to-speech output.

GitHub: https://github.com/vulture-boy/lstmPoetry
(There are a few extra models on the GitHub repo that you can access by modifying the code slightly; just change the model folder that gets loaded.)

You can check it out here: https://vulture-boy.github.io/lstmPoetry/
[The text-to-speech seems to be giving the web host some issues and only works some of the time; I’d recommend downloading the project from GitHub instead.]
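As a rough idea of what swapping the model looks like, here is a minimal sketch assuming the ml5 charRNN/LSTM generator API used in the tutorial; the 'models/poetry/' folder name, seed text, and generation settings are placeholders:

    // Point the generator at whichever trained model folder you want to load.
    // 'models/poetry/' is a placeholder; swap in any other model folder from the repo.
    let lstm;

    function setup() {
      noCanvas();
      lstm = ml5.charRNN('models/poetry/', modelReady); // older ml5 builds name this ml5.LSTMGenerator
    }

    function modelReady() {
      // Generate ~200 characters of text from a seed phrase
      lstm.generate({ seed: 'The sky was', length: 200, temperature: 0.5 }, (err, result) => {
        if (!err) {
          console.log(result.sample); // the generated text
        }
      });
    }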

To accomplish this, I scraped poetry from a website and followed the tutorial listed on ml5: Training a LSTM

Scraping

[Image: Web Scraper in Chrome]

I used the Web Scraper extension in Chrome to collect the text I needed to train the model. I needed a text file containing the material I wanted the algorithm to learn from, but I didn’t want to go through the laborious process of gathering it manually from individual web pages or Google searches. A web scraper automates that task. The only thing it requires is a ‘sitemap’ that you put together in Web Scraper’s interface: you pick out the HTML elements where the text, links, and other data of interest live, which tells the scraper how to navigate the page and what to collect.
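For reference, an exported sitemap is just a JSON description of the start URL plus the selectors to follow; it looks roughly like this (the IDs, URL, and CSS selectors here are invented for illustration, not the ones from my scrape):

    {
      "_id": "poetry-scraper",
      "startUrl": ["https://example.com/poems"],
      "selectors": [
        {
          "id": "poem-link",
          "type": "SelectorLink",
          "parentSelectors": ["_root"],
          "selector": "a.poem-title",
          "multiple": true
        },
        {
          "id": "poem-text",
          "type": "SelectorText",
          "parentSelectors": ["poem-link"],
          "selector": "div.poem-body",
          "multiple": false
        }
      ]
    }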

[Image: scraped poems]

After the process is complete (or if you decide to interrupt it), you can export a .csv of the collected data and copy the column(s) containing the desired text into a .txt file for the training step to use.

Training the Process

In order to prepare my computer for training, I had to install a few tools from my Windows 10 PowerShell, namely Chocolatey, Python 3, and a few Python packages (pip, TensorFlow, virtualenv). It’s worth noting that in order to install these I needed to enable remote scripts: by default, Windows 10 prevents you from running scripts inside PowerShell for safety reasons.

Installing Python3 (inc. Powershell setup)
Installing Tensorflow
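If I remember right, enabling this comes down to a one-line PowerShell command (RemoteSigned is one common policy choice; check which policy and scope your setup actually needs):

    # Allow locally written scripts to run; downloaded scripts still need to be signed
    Set-ExecutionPolicy RemoteSigned -Scope CurrentUser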


Once I had the packages installed, I ran the train.py file included in the training repository on a .txt file collating all the text data I had collected via web scraping. Each epoch denotes one full pass of the data through the training process, and the time/batch figure shows how many seconds each batch took. The train_loss value indicates how closely the model’s predictions matched the input data: the lower the value, the better. There are also several hyperparameters that can be adjusted to improve the quality of the result and the time it takes to train (Google has a description of this here). I used the default settings for my first batch on the poetry:

  • With about 15 minutes’ worth of scraped data (3,500 iterations, poem paragraphs), training took about 15 minutes.
  • For a second batch, I collected about 30 minutes’ worth of data from a fanfiction website (227,650 iterations, sentence- and paragraph-sized chunks); I believe training took a little over 3 hours.
  • I adjusted the hyperparameters as recommended in the ml5 training instructions for 8 MB of data on another 15-minute data set containing an entire novel (55,000 iterations, 360 chapters), and chose to run the process on my laptop rather than my desktop. The average time/batch was ~7.5 seconds, far larger than my desktop’s average of ~0.25 with default settings. It was also going to take approximately five days to complete, so I aborted the run. I tried again using default settings on my laptop: the iteration count increased from 55,000 to 178,200, but the batch time averaged a respectable 0.115 seconds. (Example commands are sketched after this list.)
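For reference, the runs above boil down to commands along these lines. The data folders are placeholders, and the hyperparameter values are only meant to suggest the kind of adjustments the ml5 instructions describe; check train.py --help for the exact flags in your copy of the training repo:

    # Default settings: point the script at a folder containing input.txt
    python train.py --data_dir=./data/poetry

    # Larger data set with adjusted hyperparameters (illustrative values)
    python train.py --data_dir=./data/novel --rnn_size 512 --num_layers 2 --seq_length 128 --batch_size 64 --num_epochs 50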


On completion, the training script creates a model folder, which can be swapped in for any other LSTM model.

Text-to-Speech

One of the contributed libraries for p5.js is p5.speech. It integrates easily into existing p5.js projects and has comprehensive documentation on its website. For my LSTM generator, I created a voice object, a couple of extra sliders to control the voice’s pitch and playback speed, and a playback button that reads the output text aloud. Now I can listen to beautiful machine-rendered poetry!
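Roughly, the playback code looks like this; it is a minimal sketch assuming p5.speech’s setPitch/setRate/speak methods, and getOutputText() stands in for however your sketch stores the generated poem:

    let voice;        // p5.Speech object from the p5.speech library
    let pitchSlider;  // controls voice pitch
    let rateSlider;   // controls playback speed

    function setup() {
      noCanvas();
      voice = new p5.Speech();
      pitchSlider = createSlider(0.1, 2, 1, 0.1);
      rateSlider = createSlider(0.1, 2, 1, 0.1);

      const playButton = createButton('Read poem');
      playButton.mousePressed(() => {
        voice.setPitch(pitchSlider.value());
        voice.setRate(rateSlider.value());
        voice.speak(getOutputText()); // placeholder for the LSTM's generated text
      });
    }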

Here’s a sample output:

The sky was blue and horry The light a bold with a more in the garden, Who heard on the moon and song the down the rasson’t good the mind her beast be oft to smell on the doss of the must the place But the see though cold to the pain With sleep the got of the brown be brain. I was the men in the like the turned and so was the chinder from the soul the Beated and seen, Some in the dome they love me fall, to year that the more the mountent to smocties, A pet the seam me and dream of the sease ends of the bry sings.
