Chapter 8 Student instructions

Modules aimed at students in a course or workshop.


8.1 Student Account Setup

In order to run your analyses, you will use the AnVIL cloud computing platform, so that you do not need to install everything on your own computer. The AnVIL (Analysis Visualization and Informatics Lab-space) platform is specially designed for analyzing biological data, and is used by scientists doing all sorts of biological research.

AnVIL in a nutshell

  • Behind the scenes, AnVIL relies on Google Cloud Platform to provide computing infrastructure. Basically, AnVIL lets you “rent” computers from Google (remotely). Whenever you run an analysis on AnVIL, it actually runs on one of Google’s computers, and AnVIL lets you see the results in your browser.
  • AnVIL uses Terra to provide many computational tools useful for biological data analysis, such as RStudio, Galaxy, and Jupyter Notebooks. Terra takes care of installing these tools on Google’s computers, so that you can just start using them.

8.1.1 Create Google Account

First, you will need to set up a (free) Google account.

If you do not already have a Google account that you would like to use for accessing AnVIL, create one now.

  • Alternatively, if you would like to create a Google account that is associated with an existing non-Gmail email address, you can follow these instructions.

8.1.2 Log In to Terra

Next, make sure you can log in to Terra – you will use Terra to perform computations on AnVIL.

You can access Terra by going to anvil.terra.bio, or by clicking the link on the AnVIL home page.

Screenshot of the AnVIL home page. The section describing Terra is highlighted.

Open Terra, and you should be prompted to sign in with your Google account.

8.1.3 Share Username

Finally, make sure your instructor has your Google account username (e.g. myname@gmail.com), so they can give you access to everything you need.

  • Make sure there are no typos!
  • If you have multiple Google accounts, make sure you give them the username that you will be using to access AnVIL.

It is very important that you share with your instructor the Google account you will be using to access AnVIL! Otherwise, the instructor cannot add you to Billing Projects or Workspaces, and you will be unable to proceed with your assignments.
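Before sending your username, a quick format check can catch the most common typos. The sketch below is only a loose syntax check written for illustration: the helper name `looks_like_account_username` is hypothetical, and the pattern is an assumption about what a well-formed address looks like, not Google's actual validation rules. It cannot tell you whether the account exists.

```python
import re

# Hypothetical helper: a loose shape check (name@domain.tld, no spaces).
# This is an assumption for illustration, not Google's validation logic.
_USERNAME_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_account_username(username: str) -> bool:
    """Return True if the text has the basic shape name@domain.tld."""
    return _USERNAME_PATTERN.fullmatch(username) is not None


if __name__ == "__main__":
    for candidate in ["myname@gmail.com", "myname@gmail", "my name@gmail.com"]:
        print(candidate, looks_like_account_username(candidate))
```

A check like this flags a missing domain ending or a stray space, but a correctly shaped address can still be the wrong account, so it is still worth reading the username back carefully.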

8.2 Student instructions for cloning a Workspace

These instructions can be customized to a specific workspace by setting certain variables before running cow::borrow_chapter(). If these variables have not been set, reasonable defaults are provided (e.g. “ask your instructor”).
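The fallback behavior described above can be pictured with a small Python sketch. This is an illustrative model only, not the R code that the module actually runs: the function `resolve_settings` and the key names `workspace_name`, `billing_project`, and `workspace_link` are hypothetical, while the default wording is taken from the "no variables set" version of the instructions.

```python
# Illustrative model only; the real module is written in R.
# Key names are hypothetical. Default wording mirrors the
# "no variables set" version of the instructions.
DEFAULTS = {
    "workspace_name": "specified by your instructor",
    "billing_project": "provided by your instructor",
    "workspace_link": "ask your instructor",
}


def resolve_settings(provided=None):
    """Overlay instructor-provided values on the defaults.

    Unset or empty values fall back to the default wording.
    """
    resolved = dict(DEFAULTS)
    for key, value in (provided or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown setting: {key}")
        if value:
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    print(resolve_settings())
    print(resolve_settings({"workspace_name": "Example_Workspace"}))
```

With nothing provided, every placeholder keeps its generic wording; providing a single value replaces only that placeholder.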

8.2.1 With no variables set:

This will not work until your instructor has given you permission to spend money to “rent” the computers that will power your analyses (by adding you to a “Billing Project”).

On AnVIL, you access files and computers through Workspaces. Each Workspace functions almost like a mini code laboratory - it is a place where data can be examined, stored, and analyzed. The first thing we want to do is to copy or “clone” a Workspace to create a space for you to experiment. This will give you access to

  • the files you will need (data, code)
  • the computing environment you will use

Tip: At this point, it might make things easier to open up a new window in your browser and split your screen. That way, you can follow along with this guide on one side and execute the steps on the other.

To clone an AnVIL Workspace:

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. You are automatically directed to the “MY WORKSPACES” tab. Here you can see any Workspaces that have been shared with you, along with your permission level. Depending on how your instructor has set things up, you may or may not see any Workspaces in this tab.

    Screenshot of Terra Workspaces page with the "MY WORKSPACES" tab selected.  The "MY WORKSPACES" tab and the column showing permission level are highlighted.

  4. Locate the Workspace specified by your instructor. (The images below show the SARS-CoV-2-Genome Workspace as an example, but you should look for the Workspace specified by your instructor.)

    1. If it has been shared with you ahead of time, it will appear in “MY WORKSPACES”.

    Screenshot of Terra Workspaces page with the "MY WORKSPACES" tab selected. The "MY WORKSPACES" tab and a Workspace name are highlighted.

    2. Otherwise, select the “PUBLIC” tab. In the top search bar, type the Workspace name specified by your instructor.

    Screenshot of Terra Workspaces page with the "PUBLIC" tab selected. The "PUBLIC" tab and search box are highlighted.  The user has typed in the term "sars". A Workspace related to SARS appears in the results.

    3. You can also go directly to the Workspace by clicking this link: ask your instructor.
  5. Clone the Workspace by clicking the teardrop button and selecting “Clone”. Or, if you have opened the Workspace, you can find the teardrop button on the top right of the Workspace.

    Screenshot showing the teardrop button. The button has been clicked revealing the "clone" option. The Clone option and the teardrop button are highlighted.

    Screenshot of the Dashboard for the Workspace that we want to clone. The teardrop button has been clicked to bring up the options. The "Clone" option from the list is highlighted.

  6. You will see a popup box appear, asking you to configure your Workspace:

    1. Give your Workspace clone a name by adding an underscore (“_”) and your name. For example, "ExampleWorkspace_Firstname_Lastname".
    2. Select the Billing Project provided by your instructor.
    3. Leave the bottom two boxes as-is.
    4. Click “CLONE WORKSPACE”.

    Screenshot showing the "clone a workspace" popout. The Workspace name, Billing Project, and Clone Workspace button have been filled in and highlighted.

  7. The new Workspace should now show up under “MY WORKSPACES”. You now have your own copy of the Workspace to work in.
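The naming convention from step 6 can be written as a one-line rule. The sketch below is a minimal illustration; the function name `clone_name` is hypothetical, and rejecting empty parts is an added assumption rather than something Terra enforces.

```python
def clone_name(workspace_name: str, first_name: str, last_name: str) -> str:
    """Build a clone name by appending an underscore and your name."""
    parts = [workspace_name.strip(), first_name.strip(), last_name.strip()]
    # Assumption: an empty part is treated as a mistake.
    if not all(parts):
        raise ValueError("workspace name, first name, and last name are all required")
    return "_".join(parts)


if __name__ == "__main__":
    print(clone_name("ExampleWorkspace", "Firstname", "Lastname"))
    # prints ExampleWorkspace_Firstname_Lastname
```

Including your name keeps each student's clone distinguishable when many copies of the same Workspace exist under one Billing Project.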

8.2.2 With variables set:

This will not work until your instructor has given you permission to spend money to “rent” the computers that will power your analyses (by adding you to a “Billing Project”).

On AnVIL, you access files and computers through Workspaces. Each Workspace functions almost like a mini code laboratory - it is a place where data can be examined, stored, and analyzed. The first thing we want to do is to copy or “clone” a Workspace to create a space for you to experiment. This will give you access to

  • the files you will need (data, code)
  • the computing environment you will use

Tip: At this point, it might make things easier to open up a new window in your browser and split your screen. That way, you can follow along with this guide on one side and execute the steps on the other.

To clone an AnVIL Workspace:

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. You are automatically directed to the “MY WORKSPACES” tab. Here you can see any Workspaces that have been shared with you, along with your permission level. Depending on how your instructor has set things up, you may or may not see any Workspaces in this tab.

    Screenshot of Terra Workspaces page with the "MY WORKSPACES" tab selected.  The "MY WORKSPACES" tab and the column showing permission level are highlighted.

  4. Locate the Workspace Example_Workspace. (The images below show the SARS-CoV-2-Genome Workspace as an example, but you should look for the Workspace Example_Workspace.)

    1. If it has been shared with you ahead of time, it will appear in “MY WORKSPACES”.

    Screenshot of Terra Workspaces page with the "MY WORKSPACES" tab selected. The "MY WORKSPACES" tab and a Workspace name are highlighted.

    2. Otherwise, select the “PUBLIC” tab. In the top search bar, type the Workspace name Example_Workspace.

    Screenshot of Terra Workspaces page with the "PUBLIC" tab selected. The "PUBLIC" tab and search box are highlighted.  The user has typed in the term "sars". A Workspace related to SARS appears in the results.

    3. You can also go directly to the Workspace by clicking this link: http://example.com/.
  5. Clone the Workspace by clicking the teardrop button and selecting “Clone”. Or, if you have opened the Workspace, you can find the teardrop button on the top right of the Workspace.

    Screenshot showing the teardrop button. The button has been clicked revealing the "clone" option. The Clone option and the teardrop button are highlighted.

    Screenshot of the Dashboard for the Workspace that we want to clone. The teardrop button has been clicked to bring up the options. The "Clone" option from the list is highlighted.

  6. You will see a popup box appear, asking you to configure your Workspace:

    1. Give your Workspace clone a name by adding an underscore (“_”) and your name. For example, "Example_Workspace_Firstname_Lastname".
    2. Select the Billing Project Example Billing Project.
    3. Leave the bottom two boxes as-is.
    4. Click “CLONE WORKSPACE”.

    Screenshot showing the "clone a workspace" popout. The Workspace name, Billing Project, and Clone Workspace button have been filled in and highlighted.

  7. The new Workspace should now show up under “MY WORKSPACES”. You now have your own copy of the Workspace to work in.

8.3 Student instructions for launching Jupyter

The module below is customized specifically for students, allowing you to give more specific instructions on the settings for their Jupyter environment. Several other general-purpose modules that may also be useful for students (e.g. Pausing Jupyter, Deleting Jupyter) can be found in other chapters of this book.

The following instructions can be customized by setting certain variables before running cow::borrow_chapter(). Developers should create these variables as a list AnVIL_module_settings. The following variables can be provided:

  • audience = Defaults to general, which tells readers to use the default Jupyter settings. If audience is set to student, the module gives more specific instructions.
  • docker_image = Optional. If provided, the module tells readers how to set the image.
  • startup_script = Optional. If provided, the module tells readers how to set the script.
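One way to picture how these three variables shape the instructions is the Python sketch below. It is an illustrative model under stated assumptions, not the module's R implementation: the function `optional_steps` is hypothetical, and the rule that the optional values are ignored unless audience is student is one plausible reading of the description above.

```python
def optional_steps(settings: dict) -> list:
    """Return the extra instruction steps implied by the settings.

    Assumption: docker_image and startup_script only matter when
    audience is "student"; otherwise the default steps are used.
    """
    audience = settings.get("audience", "general")
    if audience != "student":
        return []

    steps = []
    image = settings.get("docker_image")
    if image:
        steps.append(f"Copy into the Container image textbox: {image}")
    script = settings.get("startup_script")
    if script:
        steps.append(f"Copy into the Startup script textbox: {script}")
    return steps


if __name__ == "__main__":
    settings = {
        "audience": "student",
        "docker_image": "example docker",
        "startup_script": "example startup script",
    }
    for step in optional_steps(settings):
        print(step)
```

Under this reading, the two subsections that follow correspond to an empty list of extra steps (the default environment) and a non-empty one (the custom environment).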

8.3.1 Using default Jupyter environment:

AnVIL is very versatile and can scale up to use very powerful cloud computers. It’s very important that you select the cloud computing environment described here to avoid runaway costs.

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. Click on the name of your Workspace. You should be routed to a link that looks like: https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>.

  4. Click on the cloud icon on the far right to access your Cloud Environment options.

    Screenshot of a Terra Workspace. The cloud icon to create a new cloud environment is highlighted.

  5. In the dialogue box, click the “Settings” button under Jupyter.

    Screenshot of the Cloud Environment Details dialogue box. The Settings button under Jupyter is highlighted.

  6. You will see some configuration options for the Jupyter cloud environment, along with a list of costs, since cloud computing costs a small amount of money to use.

    Screenshot of the Jupyter Cloud Environment dialogue box. The cost to run the environment is highlighted.

  7. Leave everything else as-is. To create your Jupyter Cloud Environment, scroll down and click the “CREATE” button.

    Screenshot of the Jupyter Cloud Environment dialogue box. The "CREATE" button is highlighted.

  8. The dialogue box will close and you will be returned to your Workspace. You can see the status of your cloud environment by hovering over the Jupyter icon. It will take a few minutes for Terra to request computers and install software.

    Screenshot of a Terra Workspace. The hovertext for the Jupyter icon is highlighted, and indicates that the status of the environment is "Creating".

  9. When your environment is ready, its status will change to “Running”. Click on the “ANALYSES” tab to create or open a Jupyter Notebook.

    Screenshot of a Terra Workspace. The hovertext for the Jupyter icon is highlighted, and indicates that the status of the environment is "Running".  The ANALYSES tab is also highlighted.

  10. From the ANALYSES tab, you can click on the name of an existing Jupyter Notebook to view and launch it, or click the “START” button to create a new Notebook.

    Screenshot of Terra Workspace with the "ANALYSES" tab selected and highlighted.  The page shows a list of Jupyter Notebooks.  The Notebook names and the START button are highlighted.
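The Workspace link pattern from step 3 can be assembled from its two parts. The sketch below is a minimal illustration: the function name `workspace_url` is hypothetical, and percent-encoding the parts with `urllib.parse.quote` is an assumption about how names containing spaces appear in the address bar.

```python
from urllib.parse import quote

# Base taken from the link pattern shown in step 3.
TERRA_WORKSPACES_BASE = "https://anvil.terra.bio/#workspaces"


def workspace_url(billing_project: str, workspace_name: str) -> str:
    """Fill in the <billing-project>/<workspace-name> pattern.

    Assumption: special characters such as spaces are percent-encoded.
    """
    project = quote(billing_project, safe="")
    name = quote(workspace_name, safe="")
    return f"{TERRA_WORKSPACES_BASE}/{project}/{name}"


if __name__ == "__main__":
    print(workspace_url("Example Billing Project", "Example_Workspace_Firstname_Lastname"))
```

If the address in your browser does not contain your own Billing Project and your own clone's name, you are probably looking at the original Workspace rather than your copy.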

8.3.2 Using custom Jupyter environment:

AnVIL is very versatile and can scale up to use very powerful cloud computers. It’s very important that you select the cloud computing environment described here to avoid runaway costs.

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. Click on the name of your Workspace. You should be routed to a link that looks like: https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>.

  4. Click on the cloud icon on the far right to access your Cloud Environment options.

    Screenshot of a Terra Workspace. The cloud icon to create a new cloud environment is highlighted.

  5. In the dialogue box, click the “Settings” button under Jupyter.

    Screenshot of the Cloud Environment Details dialogue box. The Settings button under Jupyter is highlighted.

  6. You will see some configuration options for the Jupyter cloud environment, along with a list of costs, since cloud computing costs a small amount of money to use.

    Screenshot of the Jupyter Cloud Environment dialogue box. The cost to run the environment is highlighted.

  7. Under “Application configuration” you will see a dropdown menu. Choose “Custom Environment”. Then copy the following link into the “Container image” textbox:

    example docker

    Screenshot of the Jupyter Cloud Environment "Application configuration" dropdown. The option "Custom Environment" is highlighted.

    Screenshot of the Jupyter Cloud Environment dialog box. "Custom Environment" has been selected in the "Application configuration" dropdown menu, and the "Container image" textbox is highlighted.

  8. Under “Startup script” you will see a textbox. Copy the following link into the box:

    example startup script

    Screenshot of the Jupyter Cloud Environment customization dialogue box. The textbox labeled "Startup script" is highlighted.

  9. Leave everything else as-is. To create your Jupyter Cloud Environment, scroll down and click the “CREATE” button.

    Screenshot of the Jupyter Cloud Environment dialogue box. The "CREATE" button is highlighted.

  10. The dialogue box will close and you will be returned to your Workspace. You can see the status of your cloud environment by hovering over the Jupyter icon. It will take a few minutes for Terra to request computers and install software.

    Screenshot of a Terra Workspace. The hovertext for the Jupyter icon is highlighted, and indicates that the status of the environment is "Creating".

  11. When your environment is ready, its status will change to “Running”. Click on the “ANALYSES” tab to create or open a Jupyter Notebook.

    Screenshot of a Terra Workspace. The hovertext for the Jupyter icon is highlighted, and indicates that the status of the environment is "Running".  The ANALYSES tab is also highlighted.

  12. From the ANALYSES tab, you can click on the name of an existing Jupyter Notebook to view and launch it, or click the “START” button to create a new Notebook.

    Screenshot of Terra Workspace with the "ANALYSES" tab selected and highlighted.  The page shows a list of Jupyter Notebooks.  The Notebook names and the START button are highlighted.
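The wait between "Creating" and "Running" described in the steps above amounts to checking a status repeatedly until it changes. The sketch below models that idea in Python. It is purely illustrative: in practice you check the status by hovering over the icon, and the function `wait_until_running`, its parameters, and the `get_status` callable are all hypothetical stand-ins rather than part of any Terra interface.

```python
import time


def wait_until_running(get_status, poll_seconds=30.0, timeout_seconds=900.0, sleep=time.sleep):
    """Poll get_status() until it returns "Running".

    Returns the total time waited in seconds. Raises TimeoutError if
    the status has not reached "Running" within timeout_seconds.
    get_status is a hypothetical callable supplied by the caller.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "Running":
            return waited
        if waited >= timeout_seconds:
            raise TimeoutError(f"still {status!r} after {waited:.0f} seconds")
        sleep(poll_seconds)
        waited += poll_seconds


if __name__ == "__main__":
    # Simulated statuses standing in for what the hovertext would show.
    statuses = iter(["Creating", "Creating", "Running"])
    waited = wait_until_running(lambda: next(statuses), poll_seconds=1.0, sleep=lambda s: None)
    print(f"Environment ready after {waited:.0f} simulated seconds")
```

The timeout reflects the practical advice implied by the steps: if the environment has not started after a reasonable wait, stop and ask your instructor rather than waiting indefinitely.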

8.4 Student instructions for launching RStudio

The module below is customized specifically for students, allowing you to give more specific instructions on the settings for their RStudio environment. Several other general-purpose modules that may also be useful for students (e.g. Pausing RStudio, Deleting RStudio) can be found in other chapters of this book.

The following instructions can be customized by setting certain variables before running cow::borrow_chapter(). Developers should create these variables as a list AnVIL_module_settings. The following variables can be provided:

  • audience = Defaults to general, which tells readers to use the default RStudio settings. If audience is set to student, the module gives more specific instructions.
  • docker_image = Optional. If provided, the module tells readers to open the customization dialogue and directs them on how to set the image.
  • startup_script = Optional. If provided, the module tells readers to open the customization dialogue and directs them on how to set the script.

8.4.1 Using default RStudio environment:

AnVIL is very versatile and can scale up to use very powerful cloud computers. It’s very important that you select the cloud computing environment described here to avoid runaway costs.

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. Click on the name of your Workspace. You should be routed to a link that looks like: https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>.

  4. Click on the cloud icon on the far right to access your Cloud Environment options.

    Screenshot of a Terra Workspace. The cloud icon to create a new cloud environment is highlighted.

  5. In the dialogue box, click the “Settings” button under RStudio.

    Screenshot of the Cloud Environment Details dialogue box. The Settings button under RStudio is highlighted.

  6. You will see some details about the default RStudio cloud environment, along with a list of costs, since cloud computing costs a small amount of money to use.

    Screenshot of the RStudio Cloud Environment dialogue box. The cost to run the environment is highlighted.

  7. Click the “CREATE” button.

    Screenshot of the RStudio Cloud Environment dialogue box. The "CREATE" button is highlighted.

  8. The dialogue box will close and you will be returned to your Workspace. You can see the status of your cloud environment by hovering over the RStudio logo. It will take a few minutes for Terra to request computers and install software.

    Screenshot of a Terra Workspace. The hovertext for the RStudio icon is highlighted, and indicates that the status of the environment is "Creating".

  9. When your environment is ready, its status will change to “Running”. Click on the RStudio logo to open a new dialogue box that will let you launch RStudio.

    Screenshot of a Terra Workspace. The hovertext for the RStudio icon is highlighted, and indicates that the status of the environment is "Running".

  10. Click the launch icon to open RStudio. This is also where you can pause, modify, or delete your environment when needed.

    Screenshot of the RStudio Environment Details dialogue box. The "Open" button is highlighted.

  11. You should now see the RStudio interface with information about the version printed to the console.

    Screenshot of the RStudio environment interface.

8.4.2 Using custom RStudio environment:

AnVIL is very versatile and can scale up to use very powerful cloud computers. It’s very important that you select the cloud computing environment described here to avoid runaway costs.

  1. Open Terra - use a web browser to go to anvil.terra.bio

  2. Navigate to “Workspaces”: click the triple bar in the top left corner to open the drop-down menu, then click “Workspaces”.

    Screenshot of Terra drop-down menu.  The "hamburger" button to extend the drop-down menu is highlighted, and the menu item "Workspaces" is highlighted.

  3. Click on the name of your Workspace. You should be routed to a link that looks like: https://anvil.terra.bio/#workspaces/<billing-project>/<workspace-name>.

  4. Click on the cloud icon on the far right to access your Cloud Environment options.

    Screenshot of a Terra Workspace. The cloud icon to create a new cloud environment is highlighted.

  5. In the dialogue box, click the “Settings” button under RStudio.

    Screenshot of the Cloud Environment Details dialogue box. The Settings button under RStudio is highlighted.

  6. You will see some details about the default RStudio cloud environment, along with a list of costs, since cloud computing costs a small amount of money to use.

    Screenshot of the RStudio Cloud Environment dialogue box. The cost to run the environment is highlighted.

  7. Click “CUSTOMIZE” to adjust the settings for your environment.

    Screenshot of the RStudio Cloud Environment dialogue box. The "CUSTOMIZE" button is highlighted.

  8. Under “Application configuration” you will see a dropdown menu. You can also enter text here. Copy the following link into the box:

    example docker

    Screenshot of the RStudio Cloud Environment customization dialogue box. The dropdown menu labeled "Application configuration" is highlighted.

  9. Under “Startup script” you will see a textbox. Copy the following link into the box:

    example startup script

    Screenshot of the RStudio Cloud Environment customization dialogue box. The textbox labeled "Startup script" is highlighted.

  10. Leave everything else as-is. To create your RStudio Cloud Environment, click on the “CREATE” button.

    Screenshot of the RStudio Cloud Environment customization dialogue box. The "CREATE" button is highlighted.

  11. The dialogue box will close and you will be returned to your Workspace. You can see the status of your cloud environment by hovering over the RStudio logo. It will take a few minutes for Terra to request computers and install software.

    Screenshot of a Terra Workspace. The hovertext for the RStudio icon is highlighted, and indicates that the status of the environment is "Creating".

  12. When your environment is ready, its status will change to “Running”. Click on the RStudio logo to open a new dialogue box that will let you launch RStudio.

    Screenshot of a Terra Workspace. The hovertext for the RStudio icon is highlighted, and indicates that the status of the environment is "Running".

  13. Click the launch icon to open RStudio. This is also where you can pause, modify, or delete your environment when needed.

    Screenshot of the RStudio Environment Details dialogue box. The "Open" button is highlighted.

  14. You should now see the RStudio interface with information about the version printed to the console.

    Screenshot of the RStudio environment interface.