Welcome to Globus-JupyterLab’s documentation!
Globus JupyterLab is an extension to JupyterLab for submitting Globus Transfers within a running JupyterLab environment. The integration enables the user to initiate a Globus transfer from within the JupyterLab file manager.
The extension is installable via pip and is suitable both for local environments, such as a workstation or laptop, and for a Single User Server running in a JupyterHub environment like Zero-to-JupyterHub.
Installation
Globus JupyterLab requires Python 3.7 or higher. For a modern version of Python, see the official Python Installation Guide.
With pip installed, you can do the following:
pip install globus-jupyterlab
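As a quick sanity check after installing, the sketch below confirms that the interpreter meets the Python 3.7 requirement and that the extension's Python package can be located. It assumes the importable module is named globus_jupyterlab, as used in the Config Reference; the helper names are illustrative only and are not part of the extension.

```python
import importlib.util
import sys


def python_is_supported(minimum=(3, 7)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def extension_is_installed(module_name="globus_jupyterlab"):
    """Return True if the named top-level module can be located."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Python OK:", python_is_supported())
    print("globus_jupyterlab found:", extension_is_installed())
```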
Globus Connect Personal
Users can transfer to or from any accessible Globus endpoint or collection. As a convenience, Globus JupyterLab will automatically detect a local, running Globus Connect Personal endpoint. Globus Connect Personal may be downloaded from the Globus web application at https://app.globus.org/file-manager/gcp
JupyterHub
For the most part, running JupyterLab in a Hub environment is the same as running JupyterLab locally on a workstation.
Globus Connect Server collections cannot be determined automatically. These collections need to be specified manually, either directly as environment variables or through OAuthenticator.
For example:
export GLOBUS_COLLECTION_ID='MyCollectionUUID'
See Config Reference for a full list of config options.
Mapped Collections
Support is coming soon!
Guest Collections
Guest Collections are typically mounted filesystems over NFS. The same file viewed by the user in Globus JupyterLab may have a different path when viewed through Globus Connect Server.
For example, a GCS share may be mounted inside a single user server at /home/jovyan. A file in a single user server in Globus JupyterLab will be /home/jovyan/foo.txt, but can only be accessed from the Globus Collection as /foo.txt.
Setting GLOBUS_HOST_POSIX_BASEPATH to /home/jovyan fixes this mismatch. Now when Globus JupyterLab submits a transfer, paths will be translated to “GCS” paths, transferring /foo.txt instead of /home/jovyan/foo.txt.
GLOBUS_HOST_COLLECTION_BASEPATH is also available if you want Globus JupyterLab to transfer files to a subfolder inside a Guest Collection share.
See Config Reference for more info on GLOBUS_HOST_POSIX_BASEPATH and GLOBUS_HOST_COLLECTION_BASEPATH.
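The effect of the two base path settings can be pictured with a short sketch. This is not the extension's actual implementation: translate_path and its parameters are hypothetical stand-ins for GLOBUS_HOST_POSIX_BASEPATH and GLOBUS_HOST_COLLECTION_BASEPATH, and the handling of edge cases (trailing slashes, partial prefix matches) is an assumption.

```python
import posixpath


def translate_path(local_path, posix_basepath="", collection_basepath=""):
    """Translate a path seen in JupyterLab into a path on the collection.

    posix_basepath stands in for GLOBUS_HOST_POSIX_BASEPATH and
    collection_basepath for GLOBUS_HOST_COLLECTION_BASEPATH.
    Illustrative helper only; names and edge-case handling are assumptions.
    """
    path = local_path
    if posix_basepath:
        base = posix_basepath.rstrip("/")
        # Strip the prefix only when it matches whole path components.
        if path == base or path.startswith(base + "/"):
            path = path[len(base):] or "/"
    if collection_basepath:
        # Prepend the collection-side folder that JupyterLab cannot see.
        path = posixpath.join(collection_basepath, path.lstrip("/"))
    # With both settings blank, the path is returned unchanged.
    return path


print(translate_path("/home/jovyan/foo.txt", posix_basepath="/home/jovyan"))
# /foo.txt
print(translate_path("foo.txt", collection_basepath="/shared"))
# /shared/foo.txt
```

Matching on whole path components avoids rewriting an unrelated path such as /home/jovyan2/foo.txt when the base path is /home/jovyan.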
Warning
User tokens are stored in the user’s home directory by default. This path needs to be changed if the Guest Collection share could be visible to other users. See the Config Reference option GLOBUS_TOKEN_STORAGE_PATH.
Kubernetes
In Zero-to-JupyterHub, the single-user server is typically run in a pod separate from the hub, and so needs to be configured accordingly. See the User Environment Documentation.
singleuser:
  extraEnv:
    GLOBUS_COLLECTION_ID: "MyCollectionUUID"
hub:
  extraConfig:
    10-set-local-globus-collection: |
      # This is only possible if users login via the Globus OAuthenticator.
      # GLOBUS_COLLECTION_ID will take precedence if both are present.
      c.OAuthenticator.globus_local_endpoint = '1346ef68-d9b8-4757-a537-47cefb7698e8'
Customized Transfer Submissions
By default, JupyterLab submits transfer requests directly to Globus Transfer. This behavior is customizable such that JupyterLab submits to a third-party Globus Resource Server instead. This is useful when a third-party app needs to submit the transfer request to Globus Transfer.
export GLOBUS_TRANSFER_SUBMISSION_URL='https://myservice/submit-transfer'
export GLOBUS_TRANSFER_SUBMISSION_SCOPE='my_custom_globus_scope'
export GLOBUS_TRANSFER_SUBMISSION_IS_HUB_SERVICE=true
With these settings configured, JupyterLab will request the configured scope above on first login, in addition to the original transfer scope. When a user requests a transfer, the request will be submitted to the custom URL above instead of to Globus Transfer, with the following request:
{
  "transfer": {
    "source_endpoint": "ddb59aef-6d04-11e5-ba46-22000b92c6ec",
    "destination_endpoint": "ddb59af0-6d04-11e5-ba46-22000b92c6ec",
    "DATA": [
      {
        "source_path": "/share/godata/file1.txt",
        "destination": "~/",
        "recursive": false
      },
      {
        "source_path": "/foo/bar",
        "destination": "~/bar",
        "recursive": true
      }
    ]
  }
}
The custom request is expected to return the following response:
{
  "task_id": "abcdeaef-6d04-11e5-ba46-22000b92c6ec"
}
The task ID returned by the service will be used to monitor the task in Globus.
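For anyone implementing or testing such a service, the following sketch builds a request body in the documented shape and validates the expected response. The function names are hypothetical and are not part of the extension; only the field names are taken from the request and response shown above.

```python
import json


def build_submission_request(source_endpoint, destination_endpoint, items):
    """Build a request body in the documented shape.

    items is an iterable of (source_path, destination, recursive) tuples.
    """
    return {
        "transfer": {
            "source_endpoint": source_endpoint,
            "destination_endpoint": destination_endpoint,
            "DATA": [
                {
                    "source_path": source_path,
                    "destination": destination,
                    "recursive": bool(recursive),
                }
                for source_path, destination, recursive in items
            ],
        }
    }


def extract_task_id(response_text):
    """Return the task ID from the service response, or raise ValueError."""
    try:
        body = json.loads(response_text)
    except json.JSONDecodeError as err:
        raise ValueError("response is not valid JSON") from err
    task_id = body.get("task_id") if isinstance(body, dict) else None
    if not isinstance(task_id, str) or not task_id:
        raise ValueError("response is missing a 'task_id' string")
    return task_id
```

A service test could serialize the result of build_submission_request with json.dumps, post it to the custom URL, and pass the response text to extract_task_id.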
Config Reference
- class globus_jupyterlab.globus_config.GlobusConfig
Bases: object
Tracks all Globus-related information for the Globus JupyterLab server extension. Many settings can be reconfigured via environment variables in the environment where JupyterLab is run. For example:
$ export GLOBUS_REFRESH_TOKENS=true
$ jupyter lab
- get_refresh_tokens() → bool
Should JupyterLab use Refresh tokens? Default is False. When True, JupyterLab will automatically refresh access tokens, eliminating the need for additional user authentications to refresh tokens.
Configurable via environment variable: GLOBUS_REFRESH_TOKENS. Default: false
Acceptable env values:
‘true’ – use refresh tokens
‘false’ – do not use refresh tokens
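A true/false setting like this one could be read from the environment along the following lines. This is a sketch, not the extension's own parsing: env_flag is a hypothetical helper, and treating values case-insensitively and rejecting anything other than 'true' or 'false' are assumptions.

```python
import os


def env_flag(name, default=False, environ=None):
    """Interpret an environment variable as a 'true'/'false' flag."""
    environ = os.environ if environ is None else environ
    raw = environ.get(name)
    if raw is None or raw.strip() == "":
        # Unset or blank falls back to the documented default.
        return default
    value = raw.strip().lower()
    if value == "true":
        return True
    if value == "false":
        return False
    raise ValueError(f"{name} must be 'true' or 'false', got {raw!r}")


use_refresh_tokens = env_flag("GLOBUS_REFRESH_TOKENS", default=False)
```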
- get_token_storage_path() → str
Modify the default path of token storage for Globus JupyterLab. This location MUST be accessible only by the logged-in Globus user.
Configurable via environment variable: GLOBUS_TOKEN_STORAGE_PATH
Default is: ~/.globus_jupyterlab_tokens.json
“~” expands to the local POSIX user’s home directory; on JupyterHub this is /home/jovyan
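One way to spot an overly permissive token file on a POSIX system is to inspect its mode bits, as sketched below. The helper name is hypothetical, and the check is only partial: it does not examine parent directories or how a shared filesystem is exported.

```python
import os
import stat


def token_file_is_private(path="~/.globus_jupyterlab_tokens.json"):
    """Return True if no group or other permission bits are set on the file."""
    expanded = os.path.expanduser(path)
    mode = stat.S_IMODE(os.stat(expanded).st_mode)
    # Any group/other bit means another account may be able to reach the file.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```

The function raises FileNotFoundError if the token file does not exist yet, for example before the first login.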
- get_named_grant() → str
Set a custom Named Grant when a user logs into Globus. Changes the pre-filled text displayed on the Globus Consent page when logging in.
Configurable via environment variable: GLOBUS_NAMED_GRANT
- get_collection_id() → str
Configure the Globus Collection used by JupyterLab. By default, this will check for collections in the following order:
A GLOBUS_COLLECTION_ID environment variable
A local Globus Connect Personal Collection (GCP is installed)
Environment Variable set by OAuthenticator (GLOBUS_LOCAL_ENDPOINT)
If a Globus Collection is not found, transfers cannot be submitted by JupyterLab.
Configurable via environment variable: GLOBUS_COLLECTION_ID
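The lookup order described above can be sketched as a simple first-match search. resolve_collection_id is a hypothetical helper, not the extension's code, and detecting a running Globus Connect Personal collection is left to the caller.

```python
import os


def resolve_collection_id(gcp_collection_id=None, environ=None):
    """Pick a collection ID using the documented order of precedence.

    gcp_collection_id is the ID of a detected local Globus Connect Personal
    collection, or None if none was found (detection is out of scope here).
    """
    environ = os.environ if environ is None else environ
    candidates = (
        environ.get("GLOBUS_COLLECTION_ID"),   # 1. explicit configuration
        gcp_collection_id,                     # 2. local GCP collection
        environ.get("GLOBUS_LOCAL_ENDPOINT"),  # 3. set by OAuthenticator
    )
    for candidate in candidates:
        if candidate:
            return candidate
    # No collection found: transfers cannot be submitted.
    return None
```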
- get_collection_path() → str
Configure the base path for the local Globus Collection. By default, this path will assume the environment is a mapped collection or local user environment where ~ corresponds to the local user home directory. The path is prepended to all paths for files/dirs selected within JupyterLab prior to transfer.
Note
Local JupyterLab paths are not cross-checked with paths on a Globus Endpoint prior to transfer. If there is a mismatch between the base paths for each, transfers will either fail or encounter FileNotFound errors.
Configurable via environment variable: GLOBUS_COLLECTION_PATH
- get_host_posix_basepath() → str
If JupyterLab is generating incorrect paths for transfer on a Globus Collection, this setting will ‘fix’ them during transfers to ensure the path within POSIX and the path visible through the Globus Collection point to the same file. For example, if the Host Globus collection was mounted at /home/jovyan, JupyterLab and the Host collection would refer to the same file with two separate paths:
JupyterLab (POSIX): /home/jovyan/foo.txt
Collection (Globus): /foo.txt
Setting “GLOBUS_HOST_POSIX_BASEPATH=/home/jovyan” will ensure a file transferred by JupyterLab “/home/jovyan/foo.txt” will be rewritten to “/foo.txt” on transfer, such that the Globus Transfer can complete with the correct path.
By default when blank or unset, no path translation takes place.
- get_host_collection_basepath() → str
Similar to GLOBUS_HOST_POSIX_BASEPATH, this will prepend a base path on a Globus Collection which isn’t visible from JupyterLab (POSIX):
JupyterLab (POSIX): foo.txt
Collection (Globus): /shared/foo.txt
You may set “GLOBUS_HOST_COLLECTION_BASEPATH=/shared”. This will ensure a file transferred by JupyterLab “foo.txt” will be rewritten to “/shared/foo.txt” on transfer, such that the Globus Transfer can complete with the correct path.
By default when blank or unset, no path translation takes place. This setting can be used with or without GLOBUS_HOST_POSIX_BASEPATH.
- get_transfer_submission_url() → str
By default, JupyterLab will start transfers on the user’s behalf using the Globus Transfer API directly. Configure this to instead use a custom Globus Resource Server for submitting transfers on the user’s behalf.
Note: GLOBUS_TRANSFER_SUBMISSION_SCOPE must also be configured.
Configurable via environment variable: GLOBUS_TRANSFER_SUBMISSION_URL
- get_transfer_submission_scope() → str
Define a custom ‘transfer submission’ scope for submitting user transfers. Used in conjunction with GLOBUS_TRANSFER_SUBMISSION_URL. The custom scope is requested when logging in and used when submitting transfers. Transfers submitted to the custom URL will be authorized with the access token for this custom scope instead of a Globus Transfer access token.
Configurable via environment variable: GLOBUS_TRANSFER_SUBMISSION_SCOPE
- get_transfer_submission_is_hub_service() → bool
Defines how JupyterLab should authorize with the custom submission service. If the Globus Resource Server is embedded inside a hub service, set this to ‘true’ in order to use the ‘hub’ token for authorization with the hub (the hub token will be passed in the header under Authorization). The Globus token will be passed instead in POST data.
If false, the submission will not use the hub token. JupyterLab assumes the remote service is a normal Globus resource server and passes the Globus token in the “Authorization” header.
Configurable via environment variable: GLOBUS_TRANSFER_SUBMISSION_IS_HUB_SERVICE
Acceptable env values:
‘true’ - authorize with the hub token and pass the Globus token in POST data
‘false’ - pass the Globus token in the Authorization header
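The two authorization modes can be contrasted with a small sketch. Everything concrete here is an assumption made for illustration: build_auth is a hypothetical helper, and the header scheme strings ("token", "Bearer") and the globus_token field name are placeholders, since the exact wire format is not specified in this reference.

```python
def build_auth(globus_token, hub_token=None, is_hub_service=False):
    """Return (headers, extra_post_fields) for a submission request.

    The scheme strings and the 'globus_token' field name are placeholders
    chosen for this sketch, not values confirmed by the extension.
    """
    if is_hub_service:
        if not hub_token:
            raise ValueError("a hub token is required when is_hub_service is true")
        # Hub token authorizes with the hub; Globus token travels in POST data.
        headers = {"Authorization": "token " + hub_token}
        extra_fields = {"globus_token": globus_token}
    else:
        # Plain Globus resource server: Globus token goes in the header.
        headers = {"Authorization": "Bearer " + globus_token}
        extra_fields = {}
    return headers, extra_fields
```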
Changelog
All notable changes to this project will be documented in this file. See standard-version for commit guidelines.
1.0.0-beta.10 (2023-02-07)
Bug Fixes
1.0.0-beta.9 (2022-08-30)
Features
show required action to user if activation is required (0380bde)
Bug Fixes
1.0.0-beta.8 (2022-07-21)
Features
Bug Fixes
1.0.0-beta.7 (2022-07-12)
Bug Fixes
1.0.0-beta.6 (2022-06-24)
Bug Fixes
Regression with additional GCS v5.4 required logins generating incorrect login URLs (5b25909)
1.0.0-beta.5 (2022-06-23)
Bug Fixes
1.0.0-beta.4 (2022-06-07)
Bug Fixes
Alert dismiss not working (527daf8)
login url for hub login (c462d78)
multiple logins, transfer, collection types (3ddc1c0)
prettier pre-commit, error status, and use standard node path for transfer (2041175)
Respond to GCS S3 collection “Credentials Required” errors (86012f0)
Transfer Submission not properly responding to auth exceptions (4af8fe5)
1.0.0-beta.3 (2022-05-20)
Features
Bug Fixes
Auth improperly reporting logged-in state after tokens expired (0cb13e2)
Filter out non-functional endpoints from endpoint searches (5624c9a)
Frontend not prompting for login when required (7e03910)
Hide hidden files by default (28fb701)
improper 400 returned by endpoint_search (ff4d996)
Revert minimum required jupyterlab family to 3.1.0 for compatibility (46433c4)
1.0.0-beta.2 (2022-05-02)
Features
Make Globus Collection ID/path configurable (ddba2e7)
Bug Fixes
1.0.0-beta.1 (2022-04-25)
⚠ BREAKING CHANGES
Prune old globus-jupyterlab app
Features
Add basic server extension handler api (3afb240)
Add better failover support for operations requiring data_access (84b88ca)
Add endpoint_autoactivate, minor refactor for wrapping sdk posts (ca6139e)
Add Login manager for saving/loading tokens (3075c08)
Add logout to revoke user tokens (036ec95)
Add operation_ls and endpoint_search server-extension endpoints (a5494c0)
Add submit_transfer api endpoint (98e4257)
Add support for using custom resource servers (06fffd5)
Allow users to copy auth code when automation is unavailable (a04ca7f)
Make Globus Collection ID/path configurable (ddba2e7)
Support Mapped collections via re-login with data_access scope (5c7726e)
Bug Fixes
is_hub() for the config not returning boolean responses (4c2eaad)
Add missing style/index.js (b8544c8)
Bug when fetching collection id (2492556)
Earlier reference to older transfer document schema (99331a1)
extension bug if GCP owner info is not available (5309e08)
incorrect name in setup.py (3efed70)
is_gcp() possibly returning true with custom configured collection (b70189d)
jupyter labextension develop . --overwrite not working (a48cb81)
Old v3 refs causing errors on startup (80b1a59)
Remove tsconfig.spec.json to fix build (3c05583)
server extension hiding manual copy-code step (b88cf93)
Transfers not submitting correctly (09cfe45)
Prune old globus-jupyterlab app (f3ab5c0)