Filter for AWS objects with these PowerShell commands for S3
Because of S3's flat structure, it can be difficult to find your way around the service. Learn how to use PowerShell S3 commands to find the right AWS objects.
There are many ways to administer workloads on AWS. The main portal is one approach, but it's often slower than using a scripted option like PowerShell. This is especially true if you have a Windows administration background.
Amazon S3 is a core AWS offering, so admins who work on the platform need to know their way around the object storage service. S3 stores data as objects in a flat structure, and the top-level container for those objects is called a bucket. There isn't a hierarchy like you see on a local drive. Instead, the S3 console infers one by using the forward slash as a delimiter in each object's key name.
In this tutorial, you'll learn how to use PowerShell scripts to filter objects in S3.
Get started with PowerShell for AWS
Before you use PowerShell with Amazon S3, install the AWS Tools for PowerShell module and set up your credentials as described in the user guide.
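As a rough sketch of that setup, the commands below install the S3 module from the PowerShell Gallery and store a credential profile. The access key, secret key and the profile name 'MyS3Profile' are placeholders rather than values from this tutorial, and the user guide remains the authoritative reference for other installation options.

```powershell
# Install the modular S3 package for the current user only.
Install-Module -Name AWS.Tools.S3 -Scope CurrentUser

# Save an access key pair under a named profile.
# The key values and the profile name are placeholders.
Set-AWSCredential -AccessKey 'AKIAEXAMPLEKEY' -SecretKey 'exampleSecretKey' -StoreAs 'MyS3Profile'

# Use that profile and a default region for the rest of this session.
Set-AWSCredential -ProfileName 'MyS3Profile'
Set-DefaultAWSRegion -Region 'us-west-2'
```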
Before you start to look for objects, you need to select a bucket. In PowerShell, the Get-S3Bucket cmdlet will return a list of buckets based on your credentials. The list will look something like this:
PS>Get-S3Bucket
CreationDate BucketName
------------ ----------
4/25/2019 11:07:18 AM blogdemo
4/25/2019 11:08:07 AM psgitbackup
If you have a large number of buckets, you can filter them by region using the -Region parameter.
Get-S3Bucket -Region us-west-2
You'll notice below that I only have buckets in the us-west-2 region, so all of my buckets are still displayed in this example:
PS>Get-S3Bucket -Region us-west-2
CreationDate BucketName
------------ ----------
4/25/2019 11:07:19 AM blogdemo
4/25/2019 11:08:07 AM psgitbackup
To find a list of all the AWS regions using PowerShell, simply run the Get-AWSRegion cmdlet.
Find objects directly
To retrieve information about objects in S3, use the Get-S3Object cmdlet. Be sure to specify the name of the bucket with the -BucketName parameter.
Get-S3Object -BucketName 'psgitbackup'
If you run this command with just a bucket name, it returns all the objects in that bucket. To narrow your results, the Get-S3Object cmdlet offers a few helpful parameters.
If you already know the name of the object you are looking for, you can specify it by key name, which identifies a specific object within a bucket. This value is case sensitive.
Get-S3Object -BucketName 'psgitbackup' -Key 'PSDropbox/.git/hooks/applypatch-msg.sample'
This will return the desired S3 object.
PS>Get-S3Object -BucketName 'psgitbackup' -Key 'PSDropbox/.git/hooks/applypatch-msg.sample'
ETag : "ce562e08d8098926a3862fc6e7905199"
BucketName : psgitbackup
Key : PSDropbox/.git/hooks/applypatch-msg.sample
LastModified : 4/25/2019 11:11:11 AM
Owner : Amazon.S3.Model.Owner
Size : 478
StorageClass : STANDARD
The good news is that you don't have to remember all of your objects' keys in S3; you can do some filtering using the same Get-S3Object cmdlet.
Filter by prefix
With the same Get-S3Object cmdlet, you can specify a prefix so that only objects whose keys start with a certain value are returned. This is akin to listing the contents of a directory, with one difference: the match is a plain string comparison, so a prefix without a trailing delimiter also matches folders with similar names.
To only return objects that have the "PSDropbox" prefix -- which, in my case, is a folder -- add the trailing '/'.
Get-S3Object -BucketName 'psgitbackup' -KeyPrefix 'PSDropbox/'
This will return all objects in your S3 bucket that have the specified prefix. Keep in mind that -KeyPrefix is case-sensitive.
PS>Get-S3Object -BucketName 'psgitbackup' -KeyPrefix 'PSDropbox/'
ETag : "67e3be899c68d6ff7ac60673a6fe1681"
BucketName : psgitbackup
Key : PSDropbox/.git/COMMIT_EDITMSG
LastModified : 4/25/2019 11:10:58 AM
Owner : Amazon.S3.Model.Owner
Size : 51
StorageClass : STANDARD
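To illustrate why the trailing '/' matters, here is a small local sketch that mimics prefix matching on a list of key names without calling AWS. The key names 'PSDropboxOld/README.md' and 'psdropbox/notes.txt' and the helper function Select-KeyByPrefix are hypothetical, made up for this illustration.

```powershell
# Sample key names; the last two are hypothetical look-alikes.
$keys = @(
    'PSDropbox/.git/COMMIT_EDITMSG'
    'PSDropbox/src/PSDropbox.psd1'
    'PSDropboxOld/README.md'
    'psdropbox/notes.txt'
)

# Hypothetical helper that imitates a case-sensitive prefix match.
function Select-KeyByPrefix {
    param(
        [string[]] $Key,
        [string]   $Prefix
    )
    # Ordinal comparison keeps the match case-sensitive.
    $Key | Where-Object { $_.StartsWith($Prefix, [System.StringComparison]::Ordinal) }
}

# Without the trailing slash, the similarly named folder also matches.
$withoutSlash = @(Select-KeyByPrefix -Key $keys -Prefix 'PSDropbox')

# With the trailing slash, only keys under PSDropbox/ match.
$withSlash = @(Select-KeyByPrefix -Key $keys -Prefix 'PSDropbox/')
```

In this sketch, $withoutSlash also picks up the key under PSDropboxOld/, while $withSlash contains only the two keys under PSDropbox/. The lowercase key is excluded in both cases because the comparison is case-sensitive.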
Filter by delimiter
You can also use the -Delimiter parameter to group S3 objects. S3 itself has no built-in delimiter, but '/' is the conventional choice, and the console uses it to define folders when you look at S3 through the web UI. It's a similar experience in PowerShell with the -Delimiter parameter.
For instance, if you wanted to list all of the objects in the root of the S3 bucket, but none that are in any subdirectories, simply use the following:
PS>Get-S3Object -BucketName psgitbackup -Delimiter '/'
ETag : "2c6b80682f93da00c3415286b4df174a"
BucketName : psgitbackup
Key : PSLog-2019-04-16.log
LastModified : 5/16/2019 11:50:38 AM
Owner : Amazon.S3.Model.Owner
Size : 1054836
StorageClass : STANDARD
ETag : "ca7495a011c5e9a33183bffec5ae415f"
BucketName : psgitbackup
Key : PSLog-2019-04-17-1.log
LastModified : 5/16/2019 11:50:36 AM
Owner : Amazon.S3.Model.Owner
Size : 114532
StorageClass : STANDARD
ETag : "d54d7bc536dd03a33da4ef598b4709c6"
BucketName : psgitbackup
Key : PSLog-2019-04-17.log
LastModified : 5/16/2019 11:50:38 AM
Owner : Amazon.S3.Model.Owner
Size : 1449299
StorageClass : STANDARD
You can see that there are only three objects in the root of my S3 bucket. I can even verify that by looking at the console.
The S3 module in PowerShell doesn't return any folders that are also in the root. You can see them in the console, but they aren't returned by default with the Get-S3Object cmdlet. To retrieve those folders, use this command right after the previous one.
$AWSHistory.LastCommand.Responses.Last.CommonPrefixes
This will return the "CommonPrefixes" -- in this case, the folders that weren't returned by default.
PS>$AWSHistory.LastCommand.Responses.Last.CommonPrefixes
PSAirTable/
PSDropbox/
PSSherpaDesk/
PS_PDFGeneratorAPI/
PoShPal/
PowerTrello/
PowereBay/
You can also use the -Delimiter parameter to filter out subfolders that you may not want to see. For instance, since these subfolders are all Git repositories, you might not want to see the contents of each ".git" folder in the output. If you specify a delimiter of ".git", every key that contains that string is rolled up into a common prefix instead of being returned as an object.
Get-S3Object -BucketName 'psgitbackup' -Delimiter '.git'
As the object counts below show, filtering out subfolders can significantly reduce the output.
PS>(Get-S3Object -BucketName 'psgitbackup' -Delimiter '.git').count
180
PS>(Get-S3Object -BucketName 'psgitbackup').count
1702
Combine prefixes and delimiters
If you combine both prefixes and delimiters, you can easily traverse an S3 bucket similar to how you traverse a file system.
For instance, to find all the objects in a subdirectory, run the following:
Get-S3Object -BucketName psgitbackup -Prefix 'PSDropbox/src/' -Delimiter '/'
Note the trailing '/' on the -Prefix parameter, which is an alias for -KeyPrefix. Without it, the command won't return any objects.
PS>Get-S3Object -BucketName psgitbackup -Prefix 'PSDropbox/src/' -Delimiter '/'
ETag : "313ac20f13ee65e47e3d395b67d965e6"
BucketName : psgitbackup
Key : PSDropbox/src/PSDropbox.psd1
LastModified : 4/25/2019 11:10:59 AM
Owner : Amazon.S3.Model.Owner
Size : 7756
StorageClass : STANDARD
Then you can check CommonPrefixes again to see what folders or directories are also present.
PS>$AWSHistory.LastCommand.Responses.Last.CommonPrefixes
PSDropbox/src/Private/
PSDropbox/src/Public/
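To make the interplay of prefixes and delimiters more concrete, the following sketch imitates the listing rules locally on an array of key names. It is an approximation for illustration, not the service's implementation. The function name Get-SimulatedListing is hypothetical, and so are the keys marked as such in the comments.

```powershell
# Hypothetical helper: splits keys into directly listed objects and
# rolled-up common prefixes, roughly as a delimited listing would.
function Get-SimulatedListing {
    param(
        [string[]] $Key,
        [string]   $Prefix = '',
        [string]   $Delimiter = '/'
    )
    $objects = [System.Collections.Generic.List[string]]::new()
    $commonPrefixes = [System.Collections.Generic.List[string]]::new()

    foreach ($k in $Key) {
        # Skip keys that do not start with the prefix (case-sensitive).
        if (-not $k.StartsWith($Prefix, [System.StringComparison]::Ordinal)) { continue }

        # Look for the delimiter in the part of the key after the prefix.
        $rest = $k.Substring($Prefix.Length)
        $index = $rest.IndexOf($Delimiter, [System.StringComparison]::Ordinal)

        if ($index -lt 0) {
            # No delimiter left: the key is listed as an object.
            $objects.Add($k)
        }
        else {
            # Delimiter found: roll the key up into a common prefix.
            $rolledUp = $Prefix + $rest.Substring(0, $index + $Delimiter.Length)
            if (-not $commonPrefixes.Contains($rolledUp)) { $commonPrefixes.Add($rolledUp) }
        }
    }

    [pscustomobject]@{
        Objects        = $objects.ToArray()
        CommonPrefixes = $commonPrefixes.ToArray()
    }
}

$keys = @(
    'PSLog-2019-04-16.log'
    'PSDropbox/.git/COMMIT_EDITMSG'
    'PSDropbox/src/PSDropbox.psd1'
    'PSDropbox/src/Private/helper.ps1'     # hypothetical key
    'PSDropbox/src/Public/Get-Thing.ps1'   # hypothetical key
    'PowerTrello/README.md'                # hypothetical key
)

# Root of the bucket: one object, two "folders".
$root = Get-SimulatedListing -Key $keys -Delimiter '/'

# One level down: the module manifest plus the Private/ and Public/ prefixes.
$src = Get-SimulatedListing -Key $keys -Prefix 'PSDropbox/src/' -Delimiter '/'

# A '.git' delimiter rolls up every key that contains '.git'.
$noGit = Get-SimulatedListing -Key $keys -Delimiter '.git'
```

Here $root.Objects holds only the log file while $root.CommonPrefixes holds 'PSDropbox/' and 'PowerTrello/', which matches the split between the Get-S3Object output and CommonPrefixes shown earlier. Walking a bucket like a file system then amounts to repeating the call with each common prefix as the next -Prefix value.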