S3¶
The S3 sandbox type can be used to access Amazon S3 buckets. An S3 sandbox is presented to the user like any other sandbox, so the underlying storage is transparent.
Configuration Options¶
- type: Type of the sandbox. For the S3 type, set to `S3`.
- path: Prefix where the user should be sandboxed. If path is not specified, the bucket root is assumed. This is equivalent to specifying `/`.
- accessKeyId: Access Key ID to use when accessing the bucket.
- secretKey: Secret Key to use when accessing the bucket.
- sessionToken: Session token used with temporary credentials.
- bucketName: Bucket name to access. The bucket must already exist.
- region: Region the bucket is in. Defaults to `us-east-1`.
Note
The specified region must match the bucket's region. If it does not, the `endpointOverride` parameter must be used to point to the correct endpoint.
- scheme: Scheme to use when accessing the endpoint. Must be one of `https` (default) or `http`. This is useful when specifying `endpointOverride`.
- endpointOverride: Endpoint to use when accessing the bucket.
- proxyScheme: Scheme to use when using a proxy server. Must be one of `https` or `http` (default). This is useful when specifying `proxyHost`.
- proxyHost: Hostname of the proxy server.
- proxyPort: Port of the proxy server.
- proxyUserName: Username to authenticate against the proxy server.
- proxyPassword: Password to authenticate against the proxy server.
- verifySSL: Controls whether SSL connections should be verified. Use with caution.
- caPath: Path to a directory containing CA certificates.
- caFile: Path to a file containing CA certificates.
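Putting several of these options together, a sandbox definition for an S3-compatible endpoint reached through a proxy might look like the following sketch. All names, hosts, and credentials below are placeholders, not values from a real deployment:

```python
# Hypothetical sandbox configuration combining endpointOverride,
# scheme, and proxy options. Every value here is a placeholder.
sandbox = {
    'type': 'S3',
    'bucketName': 'example-bucket',            # must already exist
    'path': '/projects/',                      # prefix users are confined to
    'region': 'us-east-1',
    'scheme': 'https',                         # scheme for the endpoint below
    'endpointOverride': 's3.example.internal:9000',
    'proxyScheme': 'http',                     # default proxy scheme
    'proxyHost': 'proxy.example.internal',
    'proxyPort': 3128,
    'accessKeyId': 'ACCESSKEYID',
    'secretKey': 'secretkey',
}

# The mapping would then be registered the same way as in the
# example at the end of this section:
# api.server.setSandboxMapping('', {'': sandbox})
```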
Note
JetStream uses the S3 version 2 API, including multipart uploads. When accessing third-party endpoints, please ensure they are compatible with these requirements.
Warning
In some cases, when transfers fail, JetStream can leave incomplete multipart uploads in the S3 bucket. By default, these uploads are not visible to users, yet they still take up space in the bucket. It is recommended that a bucket lifecycle rule be created to clean up incomplete multipart uploads.
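One way to define such a lifecycle rule is with boto3. The sketch below builds a configuration that aborts incomplete multipart uploads after 7 days; the retention period and bucket name are examples, and the call that actually modifies the bucket is shown commented out:

```python
# Lifecycle configuration that aborts incomplete multipart uploads
# after 7 days. The period is illustrative; choose one that fits
# your longest expected transfer interruption.
lifecycle = {
    "Rules": [
        {
            "ID": "abort-incomplete-multipart-uploads",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},  # apply to the entire bucket
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

# Applying it with boto3 (requires AWS credentials with permission
# to put lifecycle configuration on the bucket):
#   import boto3
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="sandboxBucket", LifecycleConfiguration=lifecycle)
```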
Memory Requirements¶
When storing data to S3 buckets, JetStream Server needs to do additional file caching. The size of the file cache affects how many concurrent transfers can be stored to S3 at a time, as do the quality and conditions of the connection between the JetStream server and the S3 endpoint. The cache is not pre-allocated; it is allocated dynamically as necessary. Each JetStream transfer reserves a small portion of the file cache for upload to or download from the S3 storage.
The size of the file cache can be adjusted using the `--max-cloud-cache-size` option, and the per-transfer limits can be specified using the `--max-cloud-cache-upload-size` and `--max-cloud-cache-download-size` options.
For a larger cache, a cache file can be specified using `--cloud-cache-file`. If a cache file is specified, the server will use the file and will not require additional memory. The given file should be on fast local storage, must be at least `--max-cloud-cache-size` bytes large, and must not be modified or deleted while the server is running.
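As a sketch, a server using a 2 GiB file-backed cache might be started as follows. The binary name `jetstream-server`, the cache path, and all sizes are illustrative assumptions, not values from this documentation:

```shell
# Illustrative only: binary name, path, and sizes are assumptions.
CACHE_FILE=/var/cache/jetstream/cloud-cache.bin

# Pre-create a cache file of at least --max-cloud-cache-size bytes
# on fast local storage.
truncate -s 2G "$CACHE_FILE"

jetstream-server \
  --max-cloud-cache-size 2147483648 \
  --max-cloud-cache-upload-size 268435456 \
  --max-cloud-cache-download-size 134217728 \
  --cloud-cache-file "$CACHE_FILE"
```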
Limitations¶
- Files uploaded to S3 must be under one (1) terabyte in size.
- Objects named `.` and `..` are ignored.
- `overwriteMode` in `createTransfer()` is ignored; `OVERWRITE_IN_PLACE_ALWAYS` is assumed.
- Uploads to S3 do not support checkpoints. Interrupted transfers may restart from the beginning.
Example¶
To sandbox all users into a bucket named `sandboxBucket`, you could use the following sandboxing configuration:
>>> api.server.setSandboxMapping('', {
'': {
'type': 'S3',
'bucketName': 'sandboxBucket',
'path': '/',
'region': 'us-west-2',
'accessKeyId': 'SECRETKEYID',
'secretKey': 'secretkey'
}
})