CS 111 Spring 2010

Lecture 17: NFS Performance, and Security

May 27, 2010

Russell Matteson
Sean Morris

NFS Performance

spec.org
SPECsfs2008_nfs.v3 NFS version 3
HP BL860c i2 4-node HA-NFS Cluster (April 2010)

4 Blade Servers, each with 192GB memory and dual 146GB 15,000 RPM boot drives
8 Disk Controllers, each with 8GB cache
4 FC Switches
16 FC Disk Arrays
1,472 Disk Drives, each 72GB 15,000 RPM

server diagram

The idea here is that we have two connections to all hardware so we avoid a single point of failure.

performance chart

Overall Response Time: 1.68 msec
This response time is faster than a local disk would be and is made possible by the enormous caching capacity of the system.

NFS Security Problems

Without any security measures, the server must trust every client, including clients whose users have root access. This only works in centrally managed clusters; in other setups most users have root access on their own clients, and trusting all of them would be unreasonable. For example:

Client A:
    $ sudo sh
    # cp /bin/mount /nfs/eggert/mount
    # chown root /nfs/eggert/mount
    # chmod u+s /nfs/eggert/mount
    # ls -l
    -rwsr-xr-x ...

Client B:
    $ /nfs/eggert/mount

Uh-oh... The copy of mount run on Client B is setuid root, so it is now possible to see the entire file system.
This problem arises because the server believes any client that claims to be root (uid 0). The basic solution to this problem is for the server to remap the user id of a client claiming to be root to some other number which it has reserved for "nobody" (for example uid 65535). This remapping of root to nobody solves the immediate problem, but users can still pretend to be other users. Solving this problem requires authentication measures which we will discuss later.
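The remapping of root to "nobody" can be sketched in a few lines. This is only an illustration of the idea, not how any particular NFS server implements it; the function name squash_uid is hypothetical, and the "nobody" uid is the example value from the notes.

```python
ROOT_UID = 0
# Example "nobody" uid taken from the notes; a real server may reserve a different number.
NOBODY_UID = 65535


def squash_uid(claimed_uid: int) -> int:
    """Return the uid the server acts as for a request that claims claimed_uid."""
    if claimed_uid == ROOT_UID:
        # A client claiming to be root is treated as the unprivileged "nobody" user.
        return NOBODY_UID
    # Any other uid is still believed as-is, so users can still pretend to be each other.
    return claimed_uid


print(squash_uid(0))   # request claiming root
print(squash_uid(47))  # request claiming an ordinary user
```

Note that the second branch is exactly the remaining weakness described above: without authentication, the server has no way to check a non-root uid claim.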
There is also a problem when separate accounts are created on the server for the same user. For example, if the chemistry and physics departments share a file system, and Eggert has accounts in both departments with different user ids, then he will not be able to share files with himself and will be sad. To solve this problem the system should use usernames rather than user ids to service requests. This is the approach taken in NFS version 4. However, version 4 is not as widely supported as version 3, and version 3 is faster.

Security

It is very difficult to retrofit a system to be secure once it has been built. Therefore it is important to plan for security from day one.
Real world meaning: Defense against attacks via force and fraud
Virtual world meaning: Defense against attacks via fraud (though this may be changing)

Main Attack Categories

Attack against Privacy - For example, to learn passwords or social security numbers
Attack against Integrity - For example, to change grades or delete files
Attack against Service - For example, to deny access to government websites

Main Solutions

Disallow unauthorized access - accomplished individually by unplugging the system
Allow authorized access - accomplished individually by letting everyone into the system

These goals conflict when taken together.
It is much easier to test the case of allowing authorized access: the good guys will send bug reports while the bad guys will not.

Threat Modeling and Classification

Determining possible threats to the system should be the first step in the design process.

Insider Attack - For example, at Citibank the biggest security risks are employees who already have passwords into the system. These are often the biggest threats but exploitations are rarely reported because they are embarrassing.
Social Engineering Attacks - For example, a student pretending to be a professor over the phone in order to change academic records. Lapses in security here are usually caused by a person on the inside being too trusting of a person on the outside and thus failing to properly authenticate the outsider.
Network Attack - For example, buffer overruns, viruses (through email, drive-by downloads), denial of service (DoS and DDoS) attacks. Attacks such as these are getting worse with time. The Conficker worm falls into this category.
Device Attacks - For example, by USB or CD. This happened recently to the U.S. military in Afghanistan.

Aside: The Conficker Worm
This worm was first detected in November 2008 and has gone through several variants since that time. It combines many advanced malware techniques, which makes it difficult to detect and remove. Perhaps the scariest thing about the worm is that we know what it does but not what it will be used for. After infecting a system, the worm connects to a secure peer-to-peer network and waits for instructions. Older versions of the worm self-update to newer ones, allowing the writers to patch the worm against anti-virus efforts and introduce new means of propagation. The unknown operators behind Conficker provided a demonstration of the worm's capabilities in 2009 when it was activated to download an existing spambot and a scareware application. Since then it has returned to an inert state with over 7 million infected hosts, including military networks and hospitals. It is suspected that the operators might be planning to sell the worm or its services to the highest bidder. The maintainers of Conficker are obviously a professional group and have been closely tracking removal efforts and applying their own patches to cover the worm's own possible exploits. Conficker currently only infects Windows-based platforms, and Microsoft has responded by forming an industry group dedicated to the eradication of the worm. Currently the company is offering a $250,000 reward for information leading to the arrest of Conficker's authors.

Implementing Security Mechanisms

Authentication - For example, by password
Authorization - For example, permissions on files
Integrity - For example, logs and checksums
Auditing - For example, logs
Correctness - Secure systems must still produce correct results for people to use them
Efficiency - Secure systems must still be usable or people will just not use them

External Authentication

The main goal of authentication is to prevent masquerading

External Authentication - Used to identify outside participants
Internal Authentication - Used for inside participants

Ways to Authenticate

Based on what the principal knows - For example, a password (these can be guessed by or shared with bad guys)
Based on what the principal has - For example, keys or some other token (these can be lost or stolen)
Based on who the principal is - For example, through thumbprints or a retinal scan (thumbprints can be hijacked using a special gel substance)

Password Security

System-supplied Password - the system provides users with a secure password, but these can be difficult to remember and many users will write them down or save them in a file.
System Guess Check - A system will not accept a user-supplied password that it is able to guess.
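A system guess check might look roughly like the sketch below. The word list, the minimum length of 8, and the function name is_guessable are all hypothetical choices made for illustration; a real checker would use a much larger dictionary and more rules.

```python
# Hypothetical tiny word list; a real system would use a large dictionary.
COMMON_PASSWORDS = {"password", "123456", "qwerty", "letmein"}
MIN_LENGTH = 8  # assumed policy, not from the notes


def is_guessable(username: str, candidate: str) -> bool:
    """Return True if the system itself could plausibly guess this password."""
    lowered = candidate.lower()
    if lowered in COMMON_PASSWORDS:
        return True  # appears in the dictionary of common passwords
    if username and username.lower() in lowered:
        return True  # built from the account name
    if len(candidate) < MIN_LENGTH:
        return True  # short enough to try exhaustively
    return False


for pw in ("password", "Eggert2010", "t7#Vq9!zLm"):
    verdict = "rejected" if is_guessable("eggert", pw) else "accepted"
    print(pw, verdict)
```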

Password Attacks

Password Guessing - The attacker attempts to guess the password
Network Snoopers - An attacker tries to pick up passwords by listening to network communications
Fraudulent Servers - The attacker uses a fake server that pretends to be a real one and lets users give it their password information

Internal Authentication

The operating system stores who you are in a process descriptor.
The descriptor holds a user id (uid), a group id (gid), an effective user id (euid) and an effective group id (egid). When Eggert (uid 47) runs a setuid program owned by Gene (uid 93), the process keeps uid 47 but gets euid 93, so Gene's program can see who invoked it and apply any authorization algorithm Gene likes. The process descriptor is also consulted for all access checks, such as those made by read() and kill(). To change its user id, a process can use the system call setuid().
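As a small illustration, a process can read the four ids from its own descriptor. The sketch below uses Python's os module on a POSIX system; the helper name describe_ids is hypothetical. In an ordinary (non-setuid) process the real and effective ids are normally the same.

```python
import os


def describe_ids() -> dict:
    """Collect the real and effective ids stored for the current process."""
    return {
        "uid": os.getuid(),    # real user id: who started the process
        "gid": os.getgid(),    # real group id
        "euid": os.geteuid(),  # effective user id: used for access checks
        "egid": os.getegid(),  # effective group id
    }


ids = describe_ids()
print(ids)
# In a setuid program, euid is the file owner's id while uid stays the invoker's.
print("uid and euid differ:", ids["uid"] != ids["euid"])
```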

One question that arises is how much we want to trust the login agent. Ideally it would only be able to log people in, but here it has root access and can do anything.

Authentication Building Blocks

Cryptographic Hash Functions

h(k) = value --> This operation should be cheap
h^-1(value) = k --> This operation should be expensive

An example of a cryptographic hash function with these properties is the Secure Hash Algorithm (SHA-1, SHA-2).
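The asymmetry between h and h^-1 can be seen in a short sketch. It uses SHA-256 (a member of the SHA-2 family) from Python's hashlib. The brute-force inverter is a hypothetical helper that only works here because the search space is deliberately tiny (three lowercase letters); for realistic inputs the same search is infeasible.

```python
import hashlib
import itertools
import string


def h(k: bytes) -> str:
    """The cheap direction: one call computes the hash of k."""
    return hashlib.sha256(k).hexdigest()


def brute_force_invert(value: str, length: int):
    """The expensive direction: try every lowercase string of the given length."""
    for letters in itertools.product(string.ascii_lowercase, repeat=length):
        candidate = "".join(letters).encode()
        if h(candidate) == value:
            return candidate
    return None  # no preimage found in this search space


digest = h(b"abc")
print(digest)
print(brute_force_invert(digest, 3))
```

Each extra character multiplies the work of the inverter by 26, while the cost of computing h stays essentially the same.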

Symmetric Encryption (Shared Private Key)

Shared Private Key (K) - known to both sender and recipient
Message (M)

Given	Determining	Should Be
M, K	{M}^K	Easy
{M}^K, K	M	Easy
{M}^K	M	Hard
M, {M}^K	K	Hard

Asymmetric Encryption (Public Key)

Private Key (K) known only to recipient
Public Key (U) known to everyone
Message (M)

Given	Determining	Should Be
M, U	{M}^U	Easy
{M}^U, K	M	Easy
{M}^U, U	M	Hard
{M}^U, U	K	Hard
U	K	Hard

Hash-Based Message Authentication Code (HMAC)

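Shared Secret Key (K)
Send: sha1((K ^ pad₁) + sha1((K ^ pad₂) + M)) + M
Here ^ is bitwise XOR and + is concatenation.
Note: Simply sending sha1(K + M) + M is not sufficient because, knowing sha1(x), it is relatively easy to compute the hash of x extended with additional attacker-chosen data d, without knowing x (a length-extension attack).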
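The construction above can be written out directly and compared against Python's standard hmac module. The notes do not give the pad values or the key handling; the sketch assumes the standard HMAC definition (RFC 2104), where the key is padded to SHA-1's 64-byte block, pad₁ (outer) is the byte 0x5c repeated, and pad₂ (inner) is the byte 0x36 repeated. The function name hmac_sha1 is hypothetical.

```python
import hashlib
import hmac

BLOCK_SIZE = 64  # SHA-1 block size in bytes


def hmac_sha1(key: bytes, message: bytes) -> bytes:
    """Compute sha1((K ^ pad1) + sha1((K ^ pad2) + M)) by hand."""
    if len(key) > BLOCK_SIZE:
        key = hashlib.sha1(key).digest()       # long keys are hashed first
    key = key.ljust(BLOCK_SIZE, b"\x00")       # then zero-padded to one block
    outer_key = bytes(b ^ 0x5C for b in key)   # K ^ pad1
    inner_key = bytes(b ^ 0x36 for b in key)   # K ^ pad2
    inner = hashlib.sha1(inner_key + message).digest()
    return hashlib.sha1(outer_key + inner).digest()


key, message = b"shared secret", b"transfer 10 dollars"
tag = hmac_sha1(key, message)
# The hand-written version should agree with the library implementation.
print(tag == hmac.new(key, message, hashlib.sha1).digest())
# The receiver recomputes the tag and compares in constant time.
print(hmac.compare_digest(tag, hmac_sha1(key, message)))
```

In practice one would call the library's hmac module rather than hand-roll this; the manual version is here only to mirror the formula.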

Multiphase Approach

A sends "{nonce_A + I'm A}^U_B" to B
B sends "{nonce_A + nonce_B + I'm B}^U_A" to A
A sends "{nonce_B + K}^U_B" to B

For the rest of this transaction A and B use the shared private key K for more efficient secure communication.
This approach is taken by the Secure Shell (SSH)
You can find your public key in .ssh/id_rsa.pub
You can find your private key in .ssh/id_rsa
You can find your list of known hosts in .ssh/known_hosts
SSH is usually used as a point-to-point protocol; it is somewhat awkward for larger networks because the setup cost grows as O(n²) in the number of hosts n (each pair of hosts needs its own setup).
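The three-message exchange in the multiphase approach can be modeled as below. This is a toy model and performs no real cryptography: public-key encryption {...}^U_X is represented by a Sealed box that only the named recipient may open. The names Sealed, seal, unseal and handshake are all hypothetical, and this is not how SSH itself is implemented.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Sealed:
    """Stand-in for {payload}^U_recipient; NOT real encryption."""
    recipient: str
    payload: tuple


def seal(recipient: str, *payload) -> Sealed:
    return Sealed(recipient, payload)


def unseal(me: str, box: Sealed) -> tuple:
    # Models the rule that only the holder of the private key can decrypt.
    if box.recipient != me:
        raise PermissionError(f"{me} cannot open a message sealed for {box.recipient}")
    return box.payload


def handshake():
    # Message 1: A -> B, {nonce_A + I'm A}^U_B
    nonce_a = secrets.token_hex(8)
    msg1 = seal("B", nonce_a, "I'm A")

    # Message 2: B -> A, {nonce_A + nonce_B + I'm B}^U_A
    seen_nonce_a, _claim = unseal("B", msg1)
    nonce_b = secrets.token_hex(8)
    msg2 = seal("A", seen_nonce_a, nonce_b, "I'm B")

    # A checks that its own nonce came back, so the reply is fresh and from B.
    echoed_a, seen_nonce_b, _claim = unseal("A", msg2)
    if echoed_a != nonce_a:
        raise ValueError("nonce_A mismatch")

    # Message 3: A -> B, {nonce_B + K}^U_B
    key_at_a = secrets.token_bytes(16)
    msg3 = seal("B", seen_nonce_b, key_at_a)

    # B checks its nonce, then accepts K as the shared key.
    echoed_b, key_at_b = unseal("B", msg3)
    if echoed_b != nonce_b:
        raise ValueError("nonce_B mismatch")
    return key_at_a, key_at_b


key_a, key_b = handshake()
print("both sides hold the same K:", key_a == key_b)
```

The nonces are what tie each reply to the current run of the protocol; a recorded message from an earlier run would carry a stale nonce and be rejected.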

Building a secure virtual network atop an untrusted network:
SSH for informal networks, popular in the academic world
IPsec for formal networks, popular in the corporate world

IPsec

Much more overhead in network infrastructure
Uses internal authorization within a virtual network
Uses external authorization to set up the virtual network

Covert Channel

Answers the following question:
Given process A and process B, is it possible for A and B to communicate if they have no network access and no common server?
Yes, through a covert channel. Process A can flood the network with activity, which process B will recognize and interpret as a 1. A period of low activity is interpreted by process B as a 0. In this way, processes that should not be communicating can communicate.
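The encoding described above can be simulated in a few lines. The activity levels and the threshold are made-up numbers, and the function names are hypothetical; a real covert channel would also have to cope with noise and with agreeing on timing.

```python
# Hypothetical activity levels (for example, packets per time slot).
HIGH_ACTIVITY = 100
LOW_ACTIVITY = 5
THRESHOLD = 50  # B treats anything above this as "flooded"


def send_bits(bits):
    """Process A: one time slot per bit, flooding for 1 and staying quiet for 0."""
    return [HIGH_ACTIVITY if bit else LOW_ACTIVITY for bit in bits]


def receive_bits(observed_levels):
    """Process B: observe the activity in each slot and decode it."""
    return [1 if level > THRESHOLD else 0 for level in observed_levels]


secret = [1, 0, 1, 1, 0]
observed = send_bits(secret)
print(observed)
print(receive_bits(observed))
```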