ari: ari_stitch
The main workhorse of ari is the ari_stitch function. This function requires an ordered set of images and an ordered set of audio objects, either paths to wav files or tuneR Wave objects, that correspond to each image. The ari_stitch function sequentially “stitches” each image into the video for the duration of its corresponding audio object using ffmpeg. To use ari, one must have an ffmpeg installation to combine the audio and images. Other packages, such as animation, have a similar requirement. Moreover, on shinyapps.io, a dependency on the animation package triggers an installation of ffmpeg, so ari can be used on shinyapps.io. In the example below, two images (packaged with ari) are overlaid with white noise for demonstration. This example also allows users to check whether the output of ffmpeg works with a desired video player.
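A minimal sketch of such a call is given here. It is not necessarily the exact code that produced the output shown in this section. It assumes the ari and tuneR packages are installed and that ffmpeg is on the system; the temporary output path is our own choice.

```r
# Sketch: stitch two images shipped with ari, each paired with white noise.
# Assumes ari, tuneR, and an ffmpeg installation are available.
images <- c("mab1.png", "mab2.png")  # example images packaged with ari

if (requireNamespace("ari", quietly = TRUE) &&
    requireNamespace("tuneR", quietly = TRUE) &&
    ari::have_ffmpeg_exec()) {
  result <- ari::ari_stitch(
    images = ari::ari_example(images),
    audio  = list(tuneR::noise(), tuneR::noise()),  # one Wave object per image
    output = tempfile(fileext = ".mp4")
  )
  # The return value is a logical flag carrying extra attributes
  print(isTRUE(result))
}
```

Each image is shown for as long as its paired audio object lasts, so the lengths of `images` and `audio` must match.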
#> [1] TRUE
The output is a logical indicator, but additional attributes are available, such as the path of the output file:
if (ari::have_ffmpeg_exec()) {
  print(attributes(result)$outfile)
}
#> [1] "file139a22df1563c.mp4"
The video for this output can be seen at https://youtu.be/3kgaYf-EV90.
In ariExtra, you
#> $output_file
#> [1] "/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpeDKFCU/file139a27f214c8c.md"
#>
#> $original_images
#> [1] "/Users/johnmuschelli/Library/R/4.0/library/ari/test/mab1.png"
#> [2] "/Users/johnmuschelli/Library/R/4.0/library/ari/test/mab2.png"
#>
#> $images
#> [1] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a27f214c8c_files/slide_00001.png"
#> [2] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a27f214c8c_files/slide_00002.png"
#>
#> $script
#> [1] "hey" "ho"
#>
#> $use_knitr
#> [1] FALSE
#> [1] "---"
#> [2] "output:"
#> [3] " ariExtra::ari_document:"
#> [4] " verbose: yes"
#> [5] "---"
#> [6] ""
#> [7] ""
#> [8] "----------"
#> [9] ""
#> [10] "<!--hey-->"
#> [11] "![](/Users/johnmuschelli/Library/R/4.0/library/ari/test/mab1.png)"
#> [12] ""
#> [13] ""
#> [14] "----------"
#> [15] ""
#> [16] "<!--ho-->"
#> [17] "![](/Users/johnmuschelli/Library/R/4.0/library/ari/test/mab2.png)"
#> [18] ""
The above example uses tuneR::noise() to generate audio and to show that any audio object can be used with ari. In most cases, however, ari is most useful when combined with audio synthesized by a text-to-speech system. Though one can generate the spoken audio in many ways, such as fitting a custom deep learning model, we will focus on using the aforementioned services (e.g. Amazon Polly) as they have straightforward public web APIs. One obstacle in using such services is that users must go through several steps to provide authentication, since most of these APIs and the associated R packages do not allow for interactive authentication such as OAuth.
The text2speech package provides a unified interface to these 3 text-to-speech services, and we will focus on Amazon Polly and its authentication requirements. Polly is authenticated using the aws.signature package. The aws.signature documentation provides options and steps to create the relevant credentials; we have also provided an additional tutorial. Essentially, the user must sign up for the service, retrieve public and private API keys, and put them into their R profile or other areas accessible to R. Running text2speech::tts_auth(service = "amazon") will indicate if authentication was successful (if using a different service, change the service argument). NB: The APIs are generally paid services, but many have free tiers or limits, such as Amazon Polly’s free tier for the first year (https://aws.amazon.com/polly/pricing/).
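As a rough sketch of the authentication check, assume the keys were stored in the standard AWS environment variables (for example in `~/.Renviron`), which aws.signature can read. The helper `aws_env_is_set()` is hypothetical and not part of text2speech; it only reports whether the variables are present, not whether the keys are valid.

```r
# Hypothetical helper (not part of text2speech): report whether the standard
# AWS credential environment variables are set in this R session.
aws_env_is_set <- function() {
  vars <- c("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")
  all(nzchar(Sys.getenv(vars)))
}

# Only contact the service when credentials appear to be present
if (aws_env_is_set() && requireNamespace("text2speech", quietly = TRUE)) {
  print(text2speech::tts_auth(service = "amazon"))
}
```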
ari_spin
After Polly has been authenticated, videos can be created using the ari_spin function with an ordered set of images and a corresponding ordered set of text strings. This text is the “script” that is spoken over the images to create the output video. The number of elements in the text needs to be equal to the number of images.
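A minimal sketch of an ari_spin call follows, assuming Polly credentials are configured and ffmpeg is installed. The narration strings are invented for illustration, and the default voice for the service is used.

```r
# Sketch: one narration string per image; the two vectors must match in length.
images <- c("mab1.png", "mab2.png")  # example images packaged with ari
script <- c("This is the narration for the first image.",
            "This is the narration for the second image.")
stopifnot(length(images) == length(script))

if (requireNamespace("ari", quietly = TRUE) &&
    requireNamespace("text2speech", quietly = TRUE) &&
    ari::have_ffmpeg_exec() &&
    isTRUE(try(text2speech::tts_auth(service = "amazon"), silent = TRUE))) {
  result <- ari::ari_spin(
    images     = ari::ari_example(images),
    paragraphs = script,
    output     = tempfile(fileext = ".mp4"),
    service    = "amazon"
  )
}
```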
Many R users have experience creating slide decks with R Markdown, for example using the rmarkdown or xaringan packages. In ari, the HTML slides are rendered using webshot and the script is located in HTML comments (i.e. between <!-- and -->). For example, in the file ari_comments.Rmd included in ari, which is an ioslides type of R Markdown slide deck, we have the last slide:
x = readLines(ari_example("ari_comments.Rmd"))
tail(x[ x != ""], 4)
#> [1] "## Conclusion"
#> [2] "<!--"
#> [3] "Thank you for watching this video and good luck using Ari!"
#> [4] "-->"
so that the first words spoken on that slide are "Thank you". This setup allows for one plain text, version-controllable, integrated document that can reproducibly generate a video. We believe these features allow creators to make agile videos that can easily be updated with new material or changed when errors or typos are found. Moreover, this framework provides an opportunity to translate videos into multiple languages, which we discuss in the future directions.
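To illustrate the idea of a script stored in HTML comments, here is a small base-R sketch that pulls the commented text out of a vector of lines. The helper `extract_comments()` is hypothetical and is not how ari or ariExtra parse documents internally; it only shows the convention.

```r
# Hypothetical helper: return the text found between <!-- and --> markers.
extract_comments <- function(lines) {
  text <- paste(lines, collapse = "\n")
  # (?s) lets "." match newlines so multi-line comments are captured
  matches  <- gregexpr("(?s)<!--.*?-->", text, perl = TRUE)
  comments <- regmatches(text, matches)[[1]]
  comments <- sub("^<!--", "", comments)
  comments <- sub("-->$", "", comments)
  trimws(comments)
}

slide <- c("## Conclusion",
           "<!--",
           "Thank you for watching this video and good luck using Ari!",
           "-->")
extract_comments(slide)
#> [1] "Thank you for watching this video and good luck using Ari!"
```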
# Create a video from an R Markdown file with comments and slides
result = ariExtra::rmd_to_ari(
  ari::ari_example("ari_comments.Rmd"),
  capture_method = "iterative", open = FALSE)
The output video is located at https://youtu.be/rv9fg_qsqc0. In our experience with several users, some HTML slides take more or less time to render when using webshot; for example, they may be tinted with gray because they are in the middle of a slide transition when the image of the slide is captured. Therefore we provide the delay argument in ari_narrate, which is passed to webshot. This can resolve these issues by allowing more time for the page to fully render, although it means each video may take more time to create. We also provide the argument capture_method to allow for finely-tuned control of webshot. When capture_method = "vectorized", webshot is run on the entire slide deck in a faster process, but we have experienced slide rendering issues with this setting depending on the configuration of an individual’s computer. When capture_method = "iterative", each slide is rendered individually in webshot, which solves many rendering issues but causes videos to be rendered more slowly.
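The two settings can be combined in one call, as in the sketch below. It assumes that the example script and slide files ship with ari, that webshot (with PhantomJS) and ffmpeg are installed, and that Polly is authenticated. The delay of one second is an illustrative value, not a recommendation.

```r
# Sketch: slower but more robust slide capture.
delay_seconds  <- 1            # illustrative: extra render time per capture
capture_method <- "iterative"  # or "vectorized" for a faster single pass

if (requireNamespace("ari", quietly = TRUE) &&
    requireNamespace("webshot", quietly = TRUE) &&
    requireNamespace("text2speech", quietly = TRUE) &&
    ari::have_ffmpeg_exec() &&
    isTRUE(try(text2speech::tts_auth(service = "amazon"), silent = TRUE))) {
  result <- ari::ari_narrate(
    script         = ari::ari_example("ari_intro_script.md"),
    slides         = ari::ari_example("ari_intro.html"),
    capture_method = capture_method,
    delay          = delay_seconds  # forwarded to webshot
  )
}
```

A reasonable workflow is to start with the faster "vectorized" setting and switch to "iterative" with a delay only if the captured slides look wrong.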
In the future, other HTML headless rendering engines (webshot uses PhantomJS) may be used if they achieve better performance, but we have found webshot to work well in most of our applications.
With respect to accessibility, ari encourages video creators to type out a script by design. This provides an effortless source of subtitles for people with hearing loss rather than relying on other services, such as YouTube, to provide speech-to-text subtitles. When using ari_spin, if the subtitles argument is TRUE, then an SRT file for subtitles will be created with the video.
One issue with synthesis of technical information is that changes to the script are required for Amazon Polly or other services to provide a correct pronunciation. For example, if you want the service to say “RStudio” or “ggplot2”, the phrases “R Studio” or “g g plot 2” must be written exactly that way in the script. These phrases will then appear in an SRT subtitle file, which may be confusing to a viewer. Thus, some post-processing of the SRT file may be needed.
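One possible form of such post-processing is sketched below in base R. The helper `fix_subtitles()` and the example subtitle lines are hypothetical; in practice the lines would come from `readLines()` on the generated SRT file and be written back with `writeLines()`.

```r
# Hypothetical helper: map phonetic spellings back to their written forms.
# `replacements` is a named character vector: names are the spoken spellings,
# values are the text that should appear in the subtitles.
fix_subtitles <- function(lines, replacements) {
  for (spoken in names(replacements)) {
    lines <- gsub(spoken, replacements[[spoken]], lines, fixed = TRUE)
  }
  lines
}

srt <- c("1",
         "00:00:00,000 --> 00:00:03,000",
         "Open R Studio and load g g plot 2.")
replacements <- c("R Studio" = "RStudio", "g g plot 2" = "ggplot2")
fix_subtitles(srt, replacements)
#> [1] "1"
#> [2] "00:00:00,000 --> 00:00:03,000"
#> [3] "Open RStudio and load ggplot2."
```

Using `fixed = TRUE` treats each spoken phrase as a literal string, so characters with special meaning in regular expressions need no escaping.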
In order to create a video from a Google Slide deck or PowerPoint presentation, the slides should be converted to a set of images. We recommend using the PNG format for these images. In order to get the script for the video, we suggest putting the script for each slide in the speaker notes section of that slide. Several of the following features for video generation are in our package ariExtra (https://github.com/jhudsl/ariExtra). The speaker notes of slides can be extracted using rgoogleslides for Google Slides via the API or using readOffice/officer to read from PowerPoint documents. Google Slides can be downloaded as a PDF and converted to PNGs using the pdftools package. The ariExtra package also has a pptx_notes function for reading PowerPoint notes. Converting PowerPoint files to PDF can be done using LibreOffice and the docxtractr package, which contains the necessary wrapper functions.
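The individual steps might be chained by hand roughly as follows. This is a sketch under assumptions: `slides.pptx` is a hypothetical local file, LibreOffice is installed, and the argument names of `docxtractr::convert_to_pdf()` and the return value of `ariExtra::pptx_notes()` should be checked against the installed package versions.

```r
# Sketch: PowerPoint -> PDF -> PNG images, plus speaker notes as the script.
pptx <- "slides.pptx"                   # hypothetical input file
pdf  <- tempfile(fileext = ".pdf")

if (file.exists(pptx) &&
    requireNamespace("docxtractr", quietly = TRUE) &&
    requireNamespace("pdftools", quietly = TRUE) &&
    requireNamespace("ariExtra", quietly = TRUE)) {
  docxtractr::convert_to_pdf(pptx, pdf_file = pdf)   # requires LibreOffice
  images <- pdftools::pdf_convert(pdf, format = "png", dpi = 150)
  script <- ariExtra::pptx_notes(pptx)               # one entry per slide
  stopifnot(length(images) == length(script))
}
```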
To demonstrate this, we use an example PowerPoint located on Figshare (https://figshare.com/articles/presentation/Example_PowerPoint_for_ari/8865230). We can convert the PowerPoint to PDF, then to a set of PNG images, then extract the speaker notes.
# Check whether LibreOffice is available (needed to convert PPTX to PDF)
have_libreoffice = function() {
  x = try({docxtractr:::lo_assert()}, silent = TRUE)
  !inherits(x, "try-error")
}
if (have_libreoffice()) {
  pptx = tempfile(fileext = ".pptx")
  # mode = "wb" keeps the binary file intact on Windows
  download.file(
    paste0("https://s3-eu-west-1.amazonaws.com/",
           "pfigshare-u-files/16252631/ari.pptx"),
    destfile = pptx, mode = "wb")
  result = try({
    ariExtra::pptx_to_ari(pptx, open = FALSE)
  }, silent = TRUE)
  soffice_config_issue = inherits(result, "try-error")
  if (soffice_config_issue) {
    # Retry once after adjusting the LibreOffice library path
    ariExtra:::fix_soffice_library_path()
    result = try({
      ariExtra::pptx_to_ari(pptx, open = FALSE)
    }, silent = TRUE)
  }
  if (!inherits(result, "try-error")) {
    print(result[c("images", "script")])
  }
}
#> Getting Notes from PPTX
#> Converting PPTX to PDF
#> Converting PDF to PNGs
#> Converting page 1 to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpeDKFCU/file139a26a04a43a.png... done!
#> Converting page 2 to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpeDKFCU/file139a256ba1236.png... done!
#> Making output_file directories
#> $images
#> [1] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a25201c3b5_files/slide_00001.png"
#> [2] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a25201c3b5_files/slide_00002.png"
#>
#> $script
#> [1] "Sometimes it’s hard for an instructor to take the time to record their lectures. For example, I’m in a coffee shop and it may be loud."
#> [2] "Here is an example of a plot with really small axes. We plot the x versus the y-variables and a smoother between them."
This can be passed to ari_spin.
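As a sketch of that hand-off, the images and script elements of the returned list map directly onto the first two arguments of ari_spin. The wrapper `spin_document()` is hypothetical; running it requires ari, ffmpeg, and an authenticated text-to-speech service.

```r
# Hypothetical wrapper: narrate the images/script pair returned by
# pptx_to_ari() (or a list with the same two elements).
spin_document <- function(doc, output = tempfile(fileext = ".mp4"), ...) {
  # Fail early if the slides and narration do not line up
  stopifnot(length(doc$images) == length(doc$script))
  ari::ari_spin(
    images     = doc$images,
    paragraphs = doc$script,
    output     = output,
    ...
  )
}

# Usage (needs Polly credentials, so it is not run here):
# video <- spin_document(result, service = "amazon")
```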
For Google Slides, the slide deck can be downloaded as a PowerPoint and the previous steps can be used; however, it can also be downloaded directly as a PDF. We will use the same presentation, but uploaded to Google Slides. The ariExtra package has the function gs_to_ari to wrap this functionality (as long as link sharing is turned on), where we can pass the Google identifier:
gs_doc = ariExtra::gs_to_ari("14gd2DiOCVKRNpFfLrryrGG7D3S8pu9aZ")
#> Converting page 1 to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpeDKFCU/file139a263bce78.png... done!
#> Converting page 2 to /var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T//RtmpeDKFCU/file139a240702b7a.png... done!
gs_doc[c("images", "script")]
#> $images
#> [1] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a243677a3_files/slide_00001.png"
#> [2] "/private/var/folders/1s/wrtqcpxn685_zk570bnx9_rr0000gr/T/RtmpeDKFCU/file139a243677a3_files/slide_00002.png"
#>
#> $script
#> [1] "Sometimes it’s hard for an instructor to take the time to record their lectures. For example, I’m in a coffee shop and it may be loud."
#> [2] "Here is an example of a plot with really small axes. We plot the x versus the y-variables and a smoother between them."
Note, as Google provides a PDF version of the slides, this obviates the LibreOffice dependency.
Alternatively, the notes can be extracted using rgoogleslides for Google Slides via the API, but this requires authentication, so we omit it here. Thus, we should be able to create videos using R Markdown, Google Slides, or PowerPoint presentations in an automatic fashion.