AWS: S3

S3 is the main AWS service for storage. It provides object-based storage for static files such as text documents, Excel spreadsheets, and PowerPoint presentations. It is not block-based storage of the kind used to run operating systems, databases, and applications. A single file can be from 0 bytes to 5 TB in size, and the total aggregate storage is unlimited. Files are stored in S3 buckets.

Buckets are like folders on a local file system, but they live in Internet space and must have names that are unique across the whole AWS system. They have DNS addresses in the following format: https://s3.<region>.amazonaws.com/<bucket-name>. When a file is uploaded successfully to a bucket, S3 returns an HTTP 200 status code, but you will only see it programmatically.
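The HTTP 200 check can be sketched in Python. This is a minimal sketch, assuming the boto3 SDK, whose put_object call returns a response dictionary with the status code under ResponseMetadata. The function name upload_succeeded and the bucket and key names in the usage note are hypothetical.

```python
def upload_succeeded(s3_client, bucket, key, body):
    """Upload one object and report whether S3 answered with HTTP 200.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    response = s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    # boto3 exposes the HTTP status of the call in the response metadata.
    return response["ResponseMetadata"]["HTTPStatusCode"] == 200
```

With credentials configured, it could be called as upload_succeeded(boto3.client("s3"), "my-example-bucket", "notes.txt", b"hello"). Keep in mind that boto3 normally raises an exception for failed requests, so the check mainly confirms the success path.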

S3 originally offered two data consistency models. New objects had read-after-write consistency: a new object was readable immediately after its first PUT. Overwrites and deletes of existing objects had eventual consistency: after an update or delete, a read could briefly return the old version while the change propagated across availability zones. Since December 2020, S3 provides strong read-after-write consistency for all PUT and DELETE operations, so this delay no longer applies.

Each object in a bucket is stored as a key-value pair, along with some additional attributes:

  • Key: name of object
  • Value: the actual data of object
  • Version ID: used for versioning of object
  • Metadata: the data about the data
  • Subresources:
    • Access control lists: permissions on the individual object
    • Torrent: information for retrieving the object via BitTorrent
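The parts of an object listed above can be pictured as a small data structure. This is only an illustrative model, not how S3 stores data internally. The class name S3Object, its field names, and the sample values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class S3Object:
    """Illustrative model of the parts of an S3 object."""
    key: str                          # name of the object
    value: bytes                      # the actual data
    version_id: Optional[str] = None  # only set when versioning is used
    metadata: Dict[str, str] = field(default_factory=dict)  # data about the data
    acl: Dict[str, str] = field(default_factory=dict)       # subresource: object permissions
    torrent: Optional[bytes] = None   # subresource: torrent information


# A hypothetical object: the key looks like a path, but it is just a name.
report = S3Object(key="reports/q1.xlsx", value=b"...", metadata={"owner": "finance"})
```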

S3 is built for 99.99% availability, with a guaranteed availability of 99.9%. This is different from its 11 nines (99.999999999%) of durability. Durability is about keeping files and not losing them.
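To make these percentages concrete, the arithmetic can be sketched as follows. This is a rough calculation, assuming a 365-day year and treating the percentages as simple yearly averages. The function names and the figure of ten million objects are only for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8760, ignoring leap years


def downtime_hours_per_year(availability_percent):
    """Maximum yearly downtime implied by an availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


def expected_objects_lost_per_year(object_count, durability_percent):
    """Average number of objects lost per year at a given durability."""
    return object_count * (1 - durability_percent / 100)


designed = downtime_hours_per_year(99.99)   # about 0.876 hours, roughly 53 minutes
guaranteed = downtime_hours_per_year(99.9)  # about 8.76 hours
lost = expected_objects_lost_per_year(10_000_000, 99.999999999)  # about 0.0001
```

In other words, the gap between 99.99% and 99.9% is the gap between under an hour and almost nine hours of downtime per year, while 11 nines of durability means that with ten million stored objects you would expect to lose about one object every ten thousand years.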

Tiered storage models:

  • S3 Standard
  • S3-IA (Infrequent Access): for data that is accessed less often but needs rapid access when required. Lower storage cost, but with a retrieval fee. Stored across multiple availability zones
  • S3 One Zone IA: similar to S3-IA but only uses one availability zone
  • Glacier: cheapest option, meant for archival data. It has three retrieval modes:
    • Expedited: fastest, a few minutes
    • Standard: 3-5 hours retrieval
    • Bulk: 5-12 hours retrieval
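The Glacier retrieval modes map onto the Tier field of a restore request. Below is a minimal sketch, assuming the boto3 restore_object call and its RestoreRequest payload. The helper name build_restore_request and the default of one day are hypothetical choices for illustration.

```python
# Retrieval modes and their approximate waiting times, as listed above.
GLACIER_TIERS = {
    "Expedited": "a few minutes",
    "Standard": "3-5 hours",
    "Bulk": "5-12 hours",
}


def build_restore_request(tier, days=1):
    """Build a RestoreRequest payload for retrieving an archived object.

    days is how long the restored copy stays available.
    """
    if tier not in GLACIER_TIERS:
        raise ValueError(f"unknown retrieval tier: {tier!r}")
    return {"Days": days, "GlacierJobParameters": {"Tier": tier}}
```

The result would be passed along the lines of s3_client.restore_object(Bucket=..., Key=..., RestoreRequest=build_restore_request("Bulk")).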

S3 also offers Transfer Acceleration, which leverages the CloudFront (CDN) edge network to speed up transfers to and from S3 buckets.

Encryption can be used on objects:

  • Client side
  • Server side:
    • Amazon S3 managed keys (SSE-S3)
    • KMS (SSE-KMS)
    • Customer provided keys (SSE-C)
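The three server-side options correspond to different upload parameters. The sketch below assumes the parameter names used by the boto3 put_object call. The helper name encryption_args and its mode strings are hypothetical and simply reuse the labels from the list above.

```python
def encryption_args(mode, kms_key_id=None, customer_key=None):
    """Return the extra put_object arguments for a server-side encryption mode."""
    if mode == "SSE-S3":
        # Keys are managed entirely by S3.
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            # Optional: use a specific KMS key instead of the default one.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None:
            raise ValueError("SSE-C requires a customer-provided key")
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    raise ValueError(f"unknown encryption mode: {mode!r}")
```

The returned dictionary would be merged into the upload call, for example s3_client.put_object(Bucket=..., Key=..., Body=..., **encryption_args("SSE-S3")). Client-side encryption is not shown, since the data is encrypted before it ever reaches S3.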

Bucket access can be controlled through access control lists (ACLs) and bucket policies. By default, buckets and objects are private. Objects in a bucket do not inherit tags from the bucket.
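A bucket policy is a JSON document attached to the bucket. As a minimal sketch, the function below builds a policy that opens up read access to the objects of an otherwise private bucket. The function name, the Sid value, and the bucket name are hypothetical, and a public-read policy is only one example of what a bucket policy can express.

```python
import json


def public_read_policy(bucket_name):
    """Bucket policy document allowing anyone to read objects in the bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowPublicRead",
                "Effect": "Allow",
                "Principal": "*",  # anyone
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket_name}/*",  # all objects in the bucket
            }
        ],
    }


policy_json = json.dumps(public_read_policy("my-example-bucket"), indent=2)
```

With boto3, the JSON string would be applied with something like s3_client.put_bucket_policy(Bucket="my-example-bucket", Policy=policy_json).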

AWS: IAM

IAM is the basic service that AWS uses for user authentication. It provides centralized control, shared access, granular service permissions, identity federation, multi-factor authentication, temporary access, and password policies.

IAM is made up of a few units that work together to provide these features: users, groups, roles, and policies.

Users are, as the name implies, the end users of the AWS service. The first user is the root user, which has complete admin access. All other users are created with no permissions by default. A user can be issued an access key ID and a secret access key, which are used to access AWS resources and services programmatically. The secret access key is viewable only once, when it is created. These keys do not provide console access. Console access uses a standard user ID and password, which are separate from the access key ID and secret access key. Best practice dictates that the root account should always be set up with multi-factor authentication.

A group is a set of users who inherit the permissions given to the group. A role is a set of policies used to give access to services within AWS. Typically it is assigned to an AWS resource such as an EC2 instance. Policies are JSON-formatted documents that use key-value pairs to define permissions for users, groups, and roles.
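To show what such JSON documents look like, here is a minimal sketch of two policies written as Python dictionaries. The first is a permissions policy that could be attached to a user, group, or role. The second is the trust policy that lets EC2 use a role. The variable names and the bucket name are hypothetical.

```python
import json

# Permissions policy: read-only access to one bucket and its objects.
read_only_s3_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:ListBucket"],
            "Resource": [
                "arn:aws:s3:::my-example-bucket",    # the bucket itself (for listing)
                "arn:aws:s3:::my-example-bucket/*",  # the objects inside it
            ],
        }
    ],
}

# Trust policy: allows the EC2 service to assume the role.
ec2_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "ec2.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

print(json.dumps(read_only_s3_policy, indent=2))
```

A role for EC2 would combine both: the trust policy says who may use the role, and the permissions policy says what the role is allowed to do.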