%mount_workspace_dir notebook magic not working in EMR Studio

In an EMR Studio Python3 notebook, I execute the following:

%mount_workspace_dir .

And receive the following error:

UsageError: Line magic function `%mount_workspace_dir` not found.

I set up the EMR cluster for Studio using a CloudFormation template that is accessible to Studio via Service Catalog. The template specifies a bootstrap script that installs s3fs-fuse. It also specifies a step, executed when the cluster launches, that installs emr-notebooks-magics using pip.

Once the cluster has launched, I run the %mount_workspace_dir command above and get the error shown. I also tried restarting the kernel using the Kernel->Restart Kernel menu option, with the same result.

Here is the CloudFormation template (with substitutions for subnet and bucket names):

---
AWSTemplateFormatVersion: 2010-09-09

Parameters:
  SubnetId:
    Type: "String"

Resources:
  EmrCluster:
    Type: AWS::EMR::Cluster
    Properties:
      Applications:
        - Name: Spark
        - Name: Livy
        - Name: JupyterEnterpriseGateway
        - Name: Hive
        - Name: Presto
      EbsRootVolumeSize: '50'
      Name: !Join ['-', ['emr-studio-', !Select [4, !Split ['-', !Select [2, !Split ['/', !Ref AWS::StackId]]]]]]
      JobFlowRole: emr-studio-instance-role
      ServiceRole: EMR_DefaultRole
      ReleaseLabel: "emr-6.3.0"
      VisibleToAllUsers: true
      LogUri:
        Fn::Sub: 's3://<my-bucket>/'
      Instances:
        TerminationProtected: false
        Ec2SubnetId: '<my-subnet>'
        MasterInstanceGroup:
          InstanceCount: 1
          InstanceType: "m5.xlarge"
      BootstrapActions:
      - Name: Auto-Termination
        ScriptBootstrapAction:
          Path: "s3://<my-bucket>/scripts/bootstrap-actions/install-s3fs-fuse.sh"
      Steps:
      - Name: Enable-Notebooks-Magics
        ActionOnFailure: CONTINUE
        HadoopJarStep:
          Jar: command-runner.jar
          Args:
          - "sudo"
          - "/mnt/notebook-env/bin/pip"
          - "install"
          - "emr-notebooks-magics"

Outputs:
  ClusterId:
    Value:
      Ref: EmrCluster
    Description: The ID of the EMR Cluster

Here is the content of the install-s3fs-fuse.sh script:

sudo amazon-linux-extras install epel -y
sudo yum install s3fs-fuse -y

I also tried EMR 6.5.0 and got the same error.

Is there a step that I'm missing?



Solution 1:[1]

There appears to be a bug in the setup.py script of emr-notebooks-magics: it does not copy the 001-setup-emr-notebook-magics.py script to the correct location. To make it work, I had to add a second step that copies the script into the emr-notebook user's IPython startup directory:

  Steps:
  - Name: Enable-Notebooks-Magics
    ActionOnFailure: CONTINUE
    HadoopJarStep:
      Jar: command-runner.jar
      Args:
      - "sudo"
      - "/mnt/notebook-env/bin/pip3"
      - "install"
      - "emr-notebooks-magics"
  - Name: Copy-Magics-Script
    ActionOnFailure: CONTINUE
    HadoopJarStep:
      Jar: command-runner.jar
      Args:
      - "sudo"
      - "cp"
      - "/mnt/notebook-env/bin/001-setup-emr-notebook-magics.py"
      - "/home/emr-notebook/.ipython/profile_default/startup/"

Solution 2:[2]

You can update the EMR step to install the package as the emr-notebook user, so that the startup file is copied to the right directory:

sudo -u emr-notebook /mnt/notebook-env/bin/pip install emr-notebooks-magics
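
For reference, the command above could be expressed as a step in the CloudFormation template roughly as follows. This is an untested sketch that reuses the step name and the command-runner.jar pattern from the question's template. It assumes the step runs as a user that is allowed to call sudo -u emr-notebook, and that the emr-notebook user can write to /mnt/notebook-env.

```yaml
  Steps:
  - Name: Enable-Notebooks-Magics
    ActionOnFailure: CONTINUE
    HadoopJarStep:
      Jar: command-runner.jar
      Args:
      # Run pip as the emr-notebook user instead of root, so the package's
      # startup file is placed under that user's home directory.
      - "sudo"
      - "-u"
      - "emr-notebook"
      - "/mnt/notebook-env/bin/pip"
      - "install"
      - "emr-notebooks-magics"
```

With this variant, the separate Copy-Magics-Script step from Solution 1 should not be needed, since the point of installing as emr-notebook is that the startup file ends up in the right place. A kernel restart is likely still required before the magic is picked up.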

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution    Source
[1] Solution 1    dennislloydjr
[2] Solution 2    cyn0