CS111 Lecture 19 Scribe Notes

Cloud Computing and Security Continued

Collin Lambert and Girish Nanjundiah

Note To Class:
You should all read the cloud computing paper; it will be on the final.

Real Computing

Mainframes (1960s) are large computers that consist of peripheral processing units (PPUs) connected to a bus and to disk storage. The bus also allows communication with the CPU. They are usually used by large organizations for data processing. Some of the big names in mainframes are IBM and Fujitsu. Their advantages are:

Data-intensive computing: Mainframes were originally used for large volumes of calculations.
Data optimization: The peripheral processing units can always stream data from disk.
Reliability: Mainframes are built to run continuously, so you rarely have to worry about failures.

Mainframe

Clusters (1990s) are groups of linked computers that work together closely enough to resemble a single unit. However, the individual computers know that they are linked by a bus unless they are running as a single-image cluster (see below). Some of the big names in clusters are Beowulf and SGE (Sun Grid Engine). Some of their advantages are:

They can compute with better performance than single computers in many cases
They can be more cost-effective than single computers.
Single-image cluster: A cluster in which the applications "think" they are all on the same machine. In practice this does not work well because it is difficult to optimize and install.

Clusters

Clouds

Clouds can be thought of as "clusters of clusters." They are an abstraction for a large infrastructure in which users do not always need to know the details and mechanisms of what goes on underneath. Some of the big names are Amazon EC2 and Globus. Clouds have numerous advantages, such as reduced cost in many cases. However, there are many potential problems that need to be addressed as well:

Political Problems -> Technical Problems:
Who controls the cloud? A cloud involves multiple groups (e.g. Stanford, MIT, Caltech, etc.), but one group must be in charge of it. This maps to a security issue: who controls user access?
Who pays for the cloud? A cloud is not free; it requires plenty of hardware and man-hours to keep running. This maps to resource management and can in turn affect the performance of the cloud.

Cloud advantages:

Short-term commitment (capital investment savings).
You pay as you build your cloud.
The cloud can grow quickly as needed with fast scaling.

Cloud disadvantages:

Price ($): If the amount of computing is stable, it is cheaper to use clusters.
Privacy: Data confidentiality (can be solved with encryption).
Network latency.
Data transfer is a bottleneck.
- Archiving: Different storage sizes and speeds.
Bugs: Difficult to fix as you scale the cloud.
- Unsolved problem (if by solving you mean solving cheaply).
Security (besides confidentiality): For example, denial of service and physical attacks.
Overload risks:
- Multiple suppliers.
- There is a possibility of exceeding cloud capacity.

Other problems include vendor lock-in and software licensing:
- Vendor lock-in is where you are forced to stay with one company's products. An example would be a user choosing Microsoft to satisfy needs in one area and eventually having to move everything else to Microsoft, too.
- Software licensing is a problem when software is not free: placing multiple copies of the same piece of software in the cloud costs much more.

More Security

Groups
  Traditionally, Unix was set up so that a user could belong to only one group. This can cause many issues for users. For example, say Professor Eggert belonged to the Professor group, which meant he had access to all professor files. If he also wanted access to the T.A. files (which belonged to members of the T.A. group), he would have a problem.
  Another issue was creating a group. Traditionally, a group could be created only by root, limiting the control of an average user. If Professor Eggert wanted to create a CS111 group, he would have to contact the individual in charge of the department servers.
  With this form of security, the owner of a file could only allow access to:

Anyone on the system
The group to which you belong
Or only yourself

These permissions are stored in the form rwxrwxrwx. The first three characters are the read, write, and execute permissions for the owner of the file. The next three are for the group, and the last three are for everyone else (other).
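The rwxrwxrwx encoding can be sketched in a few lines of Python. This is only an illustration: the helper names parse_mode and allowed are made up for these notes, and special bits such as setuid or sticky are ignored. Note that only one class applies to a given user: an owner is judged by the owner bits even if the group bits are more generous.

```python
import stat

def parse_mode(text):
    """Convert a 9-character string such as 'rwxr-x---' to an integer mode."""
    if len(text) != 9:
        raise ValueError("expected 9 characters")
    mode = 0
    for i, ch in enumerate(text):
        if ch == "rwx"[i % 3]:
            mode |= 1 << (8 - i)          # leftmost character is the highest bit
        elif ch != "-":
            raise ValueError(f"bad character {ch!r} at position {i}")
    return mode

def allowed(mode, want, is_owner, in_group):
    """True if the single class that applies grants every letter in `want`."""
    if is_owner:
        shift = 6                         # owner bits
    elif in_group:
        shift = 3                         # group bits
    else:
        shift = 0                         # other bits
    bits = (mode >> shift) & 0b111
    needed = sum({"r": 4, "w": 2, "x": 1}[c] for c in want)
    return bits & needed == needed

mode = parse_mode("rwxr-x---")
print(oct(mode))                            # 0o750
print(stat.filemode(stat.S_IFREG | mode))   # -rwxr-x---
print(allowed(mode, "w", is_owner=False, in_group=True))  # False
```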

ACLs (Access Control Lists)
Access Control Lists are lists of all the users and groups that have read, write, and/or execute rights to a given file. Every file has its own ACL, which can be viewed with the getfacl [-dRLPvh] filename command and edited with the setfacl [-bkndRLPvh] [{-m|-x} acl_spec] [{-M|-X} acl_file] filename command. This allows multiple groups to have access to the same file; in the previous example, it allows Professor Eggert to access both the professor group's files and the T.A. group's files.

```
$ getfacl filename
user::rwx
user:Frank:rwx
group::--x
mask::r-x
other::rwx
```
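To see how such a listing is interpreted, here is a rough Python sketch of the POSIX ACL check order (owner, named user, groups, other), with the mask limiting named-user and group entries. It is a simplified model, not the kernel's code; the owner name "eggert", the owning group "prof", and the users "alice" and "bob" are hypothetical.

```python
def parse_acl(text):
    """Parse getfacl-style lines into (tag, qualifier, permission-set) tuples."""
    entries = []
    for line in text.strip().splitlines():
        tag, qualifier, perms = line.strip().split(":")
        entries.append((tag, qualifier, set(perms) - {"-"}))
    return entries

def lookup(entries, tag, qualifier=""):
    """Return the permission set of the matching entry, or None if absent."""
    for t, q, perms in entries:
        if t == tag and q == qualifier:
            return perms
    return None

def check_access(entries, owner, owning_group, user, groups, want):
    want = set(want)
    mask = lookup(entries, "mask")

    def masked(perms):
        return perms if mask is None else perms & mask

    if user == owner:                       # owner entry is not masked
        return want <= (lookup(entries, "user") or set())
    named = lookup(entries, "user", user)
    if named is not None:                   # named user entry, limited by mask
        return want <= masked(named)
    matching = []
    if owning_group in groups:
        matching.append(lookup(entries, "group") or set())
    for g in groups:
        perms = lookup(entries, "group", g)
        if perms is not None:
            matching.append(perms)
    if matching:                            # one matching group must grant it all
        return any(want <= masked(p) for p in matching)
    return want <= (lookup(entries, "other") or set())

ACL = parse_acl("""
user::rwx
user:Frank:rwx
group::--x
mask::r-x
other::rwx
""")

print(check_access(ACL, "eggert", "prof", "Frank", [], "rx"))       # True
print(check_access(ACL, "eggert", "prof", "Frank", [], "w"))        # False
print(check_access(ACL, "eggert", "prof", "alice", ["prof"], "r"))  # False
print(check_access(ACL, "eggert", "prof", "bob", [], "rwx"))        # True
```

Frank's entry says rwx, but the mask r-x removes write. The last line shows why an "other" entry of rwx is dangerous: a user with no entry at all ends up with more access than Frank.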

A key idea to remember is to make sure that your default ACLs are right when a resource is created. If they are, less has to be changed for every file and the file is more secure. For example, you don't want your default ACL to grant read, write, and execute to all users.

For more information on Linux ACLs, please refer to the following site: POSIX Access Control Lists on Linux

RBAC (Role-Based Access Control)
The idea of Role-Based Access Control is that every user and application can assume one or more roles. A role defines the sort of access the user or application is allowed. Some example roles could be:

Backup (given permission to read all files but not write to them)
Poweroff
Change Grades (permission to read and write to a limited number of files)
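The roles above can be modeled as a small lookup table. The following Python sketch is only an illustration: the user names, the file name grades.txt, and the function permitted are hypothetical, and a real RBAC system would enforce this inside the operating system.

```python
# Each role maps to a set of (action, resource) pairs; "*" means any resource.
ROLES = {
    "backup": {("read", "*")},
    "poweroff": {("poweroff", "system")},
    "change_grades": {("read", "grades.txt"), ("write", "grades.txt")},
}

# Each user (or application) may assume one or more roles.
USER_ROLES = {
    "operator": {"backup", "poweroff"},
    "eggert": {"change_grades"},
}

def permitted(user, action, resource):
    """True if any role held by `user` allows `action` on `resource`."""
    for role in USER_ROLES.get(user, ()):
        for act, res in ROLES[role]:
            if act == action and res in ("*", resource):
                return True
    return False

print(permitted("operator", "read", "grades.txt"))   # True (backup may read)
print(permitted("operator", "write", "grades.txt"))  # False
print(permitted("eggert", "write", "grades.txt"))    # True
```

Access is decided by the role, not the individual, so granting someone backup duty means adding one role rather than editing permissions on every file.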

There are two mechanisms for enforcing access control:

ACLs, etc.: Each resource has an ACL attached to it that holds all permissions. This method requires the operating system to act as the mediator (through system calls). The operating system checks whether the principal has rights based on the resource's ACL. This method is unforgeable since everything goes through the operating system.
Capabilities: Each principal has a set of capabilities in the form of hashed pointers to resources. Each hashed pointer is unique and encodes the rights that the principal holds. This does not require OS mediation and in some respects is much simpler than ACLs. Unfortunately, if another principal obtains or forges a hashed pointer, it may gain unwarranted access to a resource, since the access is not checked by the OS.
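One way to picture a capability is as a token that names a resource and its rights and carries a keyed hash so that it is hard to forge. The Python sketch below is an assumption made for illustration, not the scheme described in lecture: the token format, the names issue and check, and the use of HMAC-SHA256 are all hypothetical choices.

```python
import hashlib
import hmac
import secrets

SECRET = secrets.token_bytes(32)   # known only to the issuer (hypothetical)

def issue(resource, rights):
    """Create a capability token: 'resource|rights|keyed-hash'."""
    body = f"{resource}|{rights}"
    tag = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}|{tag}"

def check(token, resource, right):
    """Accept the request if the token is genuine and grants `right`."""
    try:
        res, rights, tag = token.rsplit("|", 2)
    except ValueError:
        return False               # not even shaped like a token
    expected = hmac.new(SECRET, f"{res}|{rights}".encode(),
                        hashlib.sha256).hexdigest()
    genuine = hmac.compare_digest(tag.encode(), expected.encode())
    return genuine and res == resource and right in set(rights)

token = issue("grades.txt", "r")
print(check(token, "grades.txt", "r"))   # True
print(check(token, "grades.txt", "w"))   # False: right not granted
forged = "grades.txt|rw|" + "0" * 64
print(check(forged, "grades.txt", "w"))  # False: keyed hash does not match
```

Notice that check never asks who is presenting the token. That is the simplicity of capabilities and also their risk: anyone who copies a valid token gets the same access.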

Trusted Software
From an operating system's viewpoint, applications are not trusted: they are controlled by users and run on behalf of users, and the operating system does not trust users. However, some programs must be trusted, so which software should we trust? As little as possible.

Examples:

Can we trust users to run programs that make calls like this?
```
setuid(10976)
```
Only if its use is restricted to root.
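The restriction can be modeled in a few lines. The following Python sketch simulates the POSIX rule for setuid rather than calling the real system call; the Creds class and the function sim_setuid are hypothetical names, and user ID 1000 is an arbitrary unprivileged user.

```python
from dataclasses import dataclass

@dataclass
class Creds:
    ruid: int   # real user ID
    euid: int   # effective user ID
    suid: int   # saved set-user-ID

def sim_setuid(creds, uid):
    """Model of setuid(uid): root may become anyone, others only themselves."""
    if creds.euid == 0:
        return Creds(uid, uid, uid)              # privileged: all three IDs change
    if uid in (creds.ruid, creds.suid):
        return Creds(creds.ruid, uid, creds.suid)  # only the effective ID changes
    raise PermissionError("EPERM: operation not permitted")

print(sim_setuid(Creds(0, 0, 0), 10976))         # Creds(ruid=10976, euid=10976, suid=10976)
try:
    sim_setuid(Creds(1000, 1000, 1000), 10976)
except PermissionError as err:
    print(err)                                   # EPERM: operation not permitted
```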
Can we trust the login program?

How can we trust login?
By using cryptographic checksums of programs.
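A checksum comparison could look like the following Python sketch. It assumes SHA-256 as the hash and assumes the expected digest was obtained from the vendor through some trusted channel; the function names sha256_of and is_trusted are hypothetical.

```python
import hashlib

def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 digest of a file as a hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large binaries do not have to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_trusted(path, expected_hex):
    """True if the file's digest matches the published one."""
    return sha256_of(path) == expected_hex.lower()

# Usage (the path and digest are placeholders):
#   is_trusted("/bin/login", "<digest published by the vendor>")
```

This only moves the question: you now have to trust the program that computes the checksum and the source of the expected digest.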

How can the vendor trust login?
The paper Reflections on Trusting Trust by Ken Thompson shows that we cannot always trust even login. For example, it is possible to distribute a version of login whose source file, login.c, looks safe but whose object file, login.o, contains a back door (one that makes a certain user root, etc.). Someone could make a version of gcc that inserts the back door whenever it compiles login.c, so inspecting the source would reveal nothing. As you can see, even the most essential programs cannot be trusted too easily.