Lecture 11 Scribe Notes

June 1, 2010

Scribe: Alen Zamanyan

How do you access resources?

direct

  • application has a pointer to OS object
  • access check done when pointer is given to application (often via VM)
  • + fast access after check
  • - resource can be more easily corrupted (if hardware checking is valid, we're okay; otherwise this can be problematic, e.g. an application can write anything to the screen buffer)

indirect

  • application gets an opaque handle (e.g. a file descriptor)
  • - slower access: system call overhead
  • + fine-grained access control in software (e.g. revoke access to a file descriptor; no need to rely on hardware for this)

    Most operating systems use one of these two methods (depends on whether we want speed or flexibility)
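
The two styles can be contrasted with a minimal sketch, assuming a POSIX system and using a memory-mapped file as the "direct" case. The helper make_demo_file and both function names are hypothetical, chosen only for this illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: create an unnamed temp file holding the byte 'A'. */
int make_demo_file(void) {
    char path[] = "/tmp/accessdemoXXXXXX";
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path);                       /* the file lives on until fd is closed */
    if (write(fd, "A", 1) != 1) { close(fd); return -1; }
    return fd;
}

/* Direct: the kernel checks access once, when the mapping is created.
   After that the application reads through an ordinary pointer. */
int first_byte_direct(int fd) {
    assert(fd >= 0);
    unsigned char *p = mmap(NULL, 1, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) return -1;
    int c = p[0];                       /* plain load, no system call */
    munmap(p, 1);
    return c;
}

/* Indirect: every access is a system call on the opaque handle. */
int first_byte_indirect(int fd) {
    assert(fd >= 0);
    unsigned char c;
    return pread(fd, &c, 1, 0) == 1 ? c : -1;
}
```

Both functions return the same byte; the difference is where the checking happens and how much each later access costs.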

    Access Control Goals

    [Figure: access control goals]
    Nongoal: denial of service
  • defense mechanism needed to prevent this (usually not part of access control; done at a different level)
    A good access control method accomplishes all of these goals

    To design:

  • threat analysis
  • security model
  • Example:
    a network tech is updating network software for a Republican senator's office;
    we look over his shoulder, get his password, and access the files/emails of the Democrats

    We need a way of keeping track of what accesses are allowed
    What about the access control data itself (should not be tampered with)?

  • must be updatable (sensitive operation)
  • controlled operation needed
  • indirect access, due to sensitivity (it's rare that hardware will provide a mechanism for access control according to our specific needs)
    There are two main ways to do this:

  • access control lists (ACLs) - a list of people who can access each object (e.g. a guard who checks ID)
  • capabilities - no centralized access control list; distribute a key to each user with access (e.g. we need a key to open a lock)

    In both cases:

  • must be unforgeable (at least part must live "in" the operating system)
  • must be consulted before access (no way to bypass)
  • hardware and/or operating system support needed

    Trying to represent a 3D space

  • 3D array of booleans - Can this principal access this object for this operation?
    [Figure: 3D array]

  • if principals = 1e4,
  • objects = 1e6,
  • operations = 1e2,
  • we need principals * objects * operations = 1e12 bits to store the access control info!
  • this is too much metadata
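
To make the arithmetic concrete: 1e12 bits is 1.25e11 bytes, or about 125 GB of access control metadata. The sketch below computes that figure and shows what a dense bit matrix looks like at toy scale. The toy dimensions and every name in it are assumptions for illustration, not something from the lecture.

```c
#include <assert.h>
#include <stdint.h>

/* Bits needed for a dense principals x objects x operations matrix. */
uint64_t matrix_bits(uint64_t principals, uint64_t objects, uint64_t operations) {
    return principals * objects * operations;
}

/* Toy dimensions (assumed; the lecture's sizes would need ~125 GB). */
enum { N_PRINCIPALS = 4, N_OBJECTS = 8, N_OPERATIONS = 3 };

static uint8_t matrix[(N_PRINCIPALS * N_OBJECTS * N_OPERATIONS + 7) / 8];

/* Row-major position of the (principal, object, operation) bit. */
static unsigned bit_index(unsigned p, unsigned o, unsigned op) {
    assert(p < N_PRINCIPALS && o < N_OBJECTS && op < N_OPERATIONS);
    return (p * N_OBJECTS + o) * N_OPERATIONS + op;
}

/* Record that principal p may perform operation op on object o. */
void grant(unsigned p, unsigned o, unsigned op) {
    unsigned i = bit_index(p, o, op);
    matrix[i / 8] |= (uint8_t)(1u << (i % 8));
}

/* Can this principal access this object for this operation? */
int allowed(unsigned p, unsigned o, unsigned op) {
    unsigned i = bit_index(p, o, op);
    return (matrix[i / 8] >> (i % 8)) & 1;
}
```

The lookup itself is a single bit test; the problem is purely the size of the table.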

  • On the other hand...

    Unix Permissions Model

    Each object has 9 permission bits
    ls -l output
    rwxrwxrwx ... <-------- 9 bits
    we need 32 bits to store the owner and another 32 bits to store the group, so 64 + 9 = 73 bits/object
    with 1e6 objects, we only need 73e6 bits of metadata
    + much more compact!
    + easy to check
    - only 3 operations (read, write, exec)
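
The "easy to check" point can be sketched as follows. This is a simplified model, not the kernel's code: it ignores root and supplementary groups, and the function and constant names are made up for the example.

```c
#include <assert.h>
#include <sys/types.h>

enum { WANT_R = 4, WANT_W = 2, WANT_X = 1 };

/* Pick exactly one rwx triple (owner, group, or other) and test it.
   An owner who is denied does not fall through to the group bits. */
int unix_allowed(uid_t uid, gid_t gid,          /* who is asking        */
                 uid_t owner, gid_t group,      /* object's owner/group */
                 mode_t mode, int want) {       /* 9 permission bits    */
    assert((want & ~7) == 0);
    int bits;
    if (uid == owner)
        bits = (mode >> 6) & 7;
    else if (gid == group)
        bits = (mode >> 3) & 7;
    else
        bits = mode & 7;
    return (bits & want) == want;
}
```

For example, with mode 0754 (rwxr-xr--) a group member may read and execute but not write, and everyone else may only read.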

    Initially in Unix, a process could belong to only one group; the Berkeley folks changed this so that a process can be in several groups simultaneously (the limit is relatively small - up to 8)

    Problems with groups:
    1) group access is a bit too generous in many cases
    2) the sysadmin (root) is in charge of group membership - inflexible
    3) hard to maintain - lots of people, lots of roles
    Also, say someone works for both payroll and billing but shouldn't be able to mix the two up - there is no support for this

    We can give users different userids for each set of objects they can access - but the user would have to log out of billing account and log back in to access payroll

    ACLs (widely introduced in Windows NT)

  • now in Linux, Solaris
  • complicates the representation of permissions - associated with each file (& operation) is a list of users (+ groups) that can access it
  • e.g. for a given object, we have a list of users & associated operations and another list of groups & associated operations
    $ getfacl object <--------- get file access control list
    user::rwx
    user:adnan:rwx
    group::r-x
    other::---
    e.g.
    /u/class/spring10/cs111 <---------- has an access control list that allows 2 TAs to read/write
  • There is also a setfacl command that sets the file access control list for a given object.
  • The getfacl/setfacl pair of commands takes care of #2 above (i.e. adds flexibility: a file's owner can grant access without asking root)
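
A per-object ACL check can be sketched like this. It is a deliberately reduced model (named users plus an "other" entry, no groups and no mask), and the struct layout and names are assumptions for illustration rather than how any real file system stores ACLs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { ACL_R = 4, ACL_W = 2, ACL_X = 1 };

struct acl_entry {
    const char *user;   /* named user */
    int perms;          /* operations this user may perform */
};

struct acl {
    const struct acl_entry *entries;
    int n;
    int other;          /* permissions for anyone not listed */
};

/* Walk the object's list; the first entry naming the user decides. */
int acl_allowed(const struct acl *a, const char *user, int want) {
    assert(a != NULL && user != NULL);
    for (int i = 0; i < a->n; i++)
        if (strcmp(a->entries[i].user, user) == 0)
            return (a->entries[i].perms & want) == want;
    return (a->other & want) == want;
}
```

Unlike the 9-bit model, the list can name any number of individual users, at the cost of a variable-length structure per object.
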

    Role-based Access Control (RBAC)

    used in Solaris, Active Directory

    Users can assume roles

  • rights are associated with roles not with users
  • if you assume a role (employee, instructor-in-charge, student), you may lose rights of previous role
  • you can have >1 sessions with different roles

  • this is more appropriate for a large organization where roles are clearly defined
  • it often comes with fine-grained control over operations
  • eg - normally, calling unlink(d), where d is a directory, is not allowed (why is this? - answer below); however, role-based AC can allow users to unlink a directory

    Why can't we normally unlink a directory?
    [Figure: unlinking a directory]
  • unlink("bin") will result in linkcount--
  • in this case, the link count goes from 2 to 1 (the directory's own "." entry still refers to it), but we have lost the parent's pointer to this directory and have not reclaimed its storage (leak)

  • e.g. (another example of fine-grained control in RBAC) - linking to a file that you don't own is normally allowed, but can be disallowed in role-based access.

    Say you want to do the following:
    link("/user/adnan/a", "b")

    Why might we need to disallow this?
  • If user adnan grants me permission to '/user/adnan/a' and later wants to revoke it, I may already have a hard link (namely 'b') to the file; if he had disallowed linking from the start (as can be done in RBAC), this problem would not arise

  • A hacker does the following:
    $ ls -l /bin/passwd
    -rwsr-xr-x root /bin/passwd
    $ ln /bin/passwd $HOME/p

    Suppose a bug is found in the passwd program (the hacker knows about this bug)
    When we try to fix the security hole, we do the following
    # cat fix > /bin/password.new <----------- create new file
    # chmod 4755 /bin/password.new
    # mv /bin/password.new /bin/passwd

    The bug doesn't get fixed because the old file's link count was 2 (instead of the expected 1): mv replaces only the name /bin/passwd, so the old, buggy, setuid-root file is not deleted. It is still reachable as $HOME/p, and the hacker is able to keep exploiting the bug
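
The scenario can be replayed harmlessly in a scratch directory. The sketch below assumes a POSIX system with a writable /tmp; the file names stand in for /bin/passwd and $HOME/p, the setuid bit is left out (it would need root), and all function names are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create (or truncate) path and fill it with text. Returns 0 on success. */
static int write_file(const char *path, const char *text) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0755);
    if (fd < 0) return -1;
    ssize_t n = write(fd, text, strlen(text));
    close(fd);
    return n == (ssize_t)strlen(text) ? 0 : -1;
}

/* Returns 1 if the old contents are still reachable through the extra
   link after the "fix", 0 if not, -1 if the setup failed. */
int old_program_survives(void) {
    char dir[] = "/tmp/linkdemoXXXXXX";
    char prog[64], fix[64], stash[64], buf[16] = {0};
    struct stat st;
    int result = -1;

    if (mkdtemp(dir) == NULL) return -1;
    int len = snprintf(prog, sizeof prog, "%s/passwd", dir);   /* "/bin/passwd" */
    assert(len > 0 && (size_t)len < sizeof prog);
    snprintf(fix, sizeof fix, "%s/passwd.new", dir);
    snprintf(stash, sizeof stash, "%s/p", dir);                /* "$HOME/p" */

    if (write_file(prog, "buggy") == 0 &&
        link(prog, stash) == 0 &&                    /* attacker: ln passwd p  */
        stat(prog, &st) == 0 && st.st_nlink == 2 &&  /* link count is now 2    */
        write_file(fix, "fixed") == 0 &&
        rename(fix, prog) == 0) {                    /* admin: mv passwd.new passwd */
        int fd = open(stash, O_RDONLY);
        if (fd >= 0) {
            if (read(fd, buf, sizeof buf - 1) >= 0)
                result = strcmp(buf, "buggy") == 0;
            close(fd);
        }
    }
    unlink(prog); unlink(fix); unlink(stash); rmdir(dir);
    return result;
}
```

The rename swaps what the name refers to; it never touches the old inode, which stays alive as long as the second link exists.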

    Capabilities

    Possession of this 'word' implies rights to an object (the 'word' may be 64 or 128 bits)
    [Figure: capability word]

  • Grant permission to a process by passing it a key
  • But the process can email it to anyone else, potentially granting access to users who weren't meant to have it!

    How to implement

    1) encryption - capabilities sent across the network
  • in effect, process 1 does 'fd = open();' and sends fd to process 2
    2) index into an operating system table (e.g. file descriptor)
  • Can we forge a file descriptor? - No, we must use 'open', which uses an access control method internally

    Issues
    1) 'words' must be wide enough that they can't be guessed (at least 64 bits)
    2) containment - the capability can escape! (e.g. if you log it somewhere)

    Capabilities are lightweight, easy to understand - as a result, they are popular in academia
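
The "index into an operating system table" approach might look roughly like the sketch below. It is a toy model with a single global table (a real kernel keeps one per process), and the names cap_grant, cap_check and cap_revoke are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { CAP_READ = 1, CAP_WRITE = 2 };
enum { CAP_MAX = 16 };

struct cap_entry {
    int in_use;
    const char *object;   /* what the capability refers to */
    unsigned rights;
};

/* Lives "inside the OS": applications only ever see the index. */
static struct cap_entry cap_table[CAP_MAX];

/* Hand out a capability. The caller is assumed to have already run the
   real access check (as open() does). Returns a handle, or -1 if full. */
int cap_grant(const char *object, unsigned rights) {
    assert(object != NULL);
    for (int h = 0; h < CAP_MAX; h++) {
        if (!cap_table[h].in_use) {
            cap_table[h] = (struct cap_entry){ 1, object, rights };
            return h;
        }
    }
    return -1;
}

/* A handle that was never granted (or was revoked) confers nothing. */
int cap_check(int h, unsigned want) {
    if (h < 0 || h >= CAP_MAX || !cap_table[h].in_use) return 0;
    return (cap_table[h].rights & want) == want;
}

/* Revocation in software: clear the slot, no hardware support needed. */
void cap_revoke(int h) {
    if (h >= 0 && h < CAP_MAX) cap_table[h].in_use = 0;
}
```

Because the table is only reachable through these functions, guessing an index does not forge a capability. The width requirement in issue 1 applies to the encrypted, sent-over-the-network variant, where the word itself is the secret.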

    Denial of Service Attacks

    defense methods

  • What if 1e6 attackers visit whitehouse.gov/feedback and add the comment 'Prez is dope!' - how can we defend against this?
  • use a captcha - can't be read by a robot, but can be read by a human
    [Figure: captcha]
  • log IP addresses, keep track of (recent) bad guys
  • most botnets have the target's IP address hard-wired for performance (they can't afford a DNS lookup) - so change the server's IP address
  • make your web server faster (enough capacity that DoS isn't really effective)

  • how do we make Apache faster?
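    /* listen_fd, buf (a char array), and handle_request() are assumed to be set up elsewhere */
    for (;;) {
    	int fd = accept(listen_fd, NULL, NULL);
    	ssize_t n = read(fd, buf, sizeof buf);   /* <-------- this hangs if there are no bytes! */
    	handle_request(fd, buf, n);
    	close(fd);
    }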

    Approach 1 - fork():

  • What if we try to fork off a child process to do the reading of bytes across the stream and handling of requests?
  • this is slow because Apache is a big process; needs to be copied
  • use vfork (share memory) instead

    Approach 2 - go multithreaded:

  • all threads must share memory
  • it's a pain to maintain scripts from different authors
  • - bug in one handler can corrupt the whole system
    + faster

    Approach 3 - preforked children:

  • Apache forks 20 children at startup
  • child handles request, but already exists
  • if it dies, fork off another one
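
The preforking pattern can be sketched as follows. This is not Apache's code: the worker body is a stand-in (a real worker would loop on accept() and serve requests), and prefork, replace_dead and reap_all are hypothetical names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in worker: exits at once with status 0. */
int demo_worker(int id) {
    (void)id;
    return 0;
}

/* Fork n workers up front and record their pids. Returns how many started. */
int prefork(int n, int (*worker)(int), pid_t *pids) {
    assert(n > 0 && worker != NULL && pids != NULL);
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid < 0) break;
        if (pid == 0) _exit(worker(i));   /* child never returns to the caller */
        pids[started++] = pid;
    }
    return started;
}

/* Wait for any worker to die, then fork a replacement into its slot.
   Returns the slot index, or -1 on failure. */
int replace_dead(int n, int (*worker)(int), pid_t *pids) {
    pid_t dead = waitpid(-1, NULL, 0);
    for (int i = 0; i < n && dead > 0; i++) {
        if (pids[i] == dead) {
            pid_t pid = fork();
            if (pid < 0) return -1;
            if (pid == 0) _exit(worker(i));
            pids[i] = pid;
            return i;
        }
    }
    return -1;
}

/* Reap every remaining worker; returns how many exited with status 0. */
int reap_all(int n, const pid_t *pids) {
    int ok = 0;
    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}
```

The cost of fork() is paid at startup and when a worker dies, not on the request path.
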

    Approach 4 - Event-based (fastest web servers):

    + nonblocking I/O
    + multithreaded (1/CPU) - threads never wait
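
A single pass of such an event loop might be sketched with poll() and nonblocking descriptors, as below. It is a minimal illustration under POSIX assumptions, not how any particular server is written; set_nonblocking, try_read, serve_ready and the WOULD_BLOCK sentinel are names invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

enum { WOULD_BLOCK = -2 };   /* sentinel: no data yet, try again later */

/* Put fd in nonblocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd) {
    assert(fd >= 0);
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0) return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* read() that reports "nothing there yet" instead of hanging. */
ssize_t try_read(int fd, char *buf, size_t len) {
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) return WOULD_BLOCK;
    return n;
}

/* One pass of the loop: ask the kernel which descriptors are readable,
   then service only those. Returns how many had data. */
int serve_ready(struct pollfd *fds, nfds_t nfds, int timeout_ms) {
    int handled = 0;
    if (poll(fds, nfds, timeout_ms) <= 0) return 0;
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & POLLIN) {
            char buf[512];
            if (try_read(fds[i].fd, buf, sizeof buf) > 0)
                handled++;   /* a real server would parse and answer here */
        }
    }
    return handled;
}
```

A slow or silent client costs the thread nothing: its descriptor simply never shows up as readable.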

    Ultimately, this is an example of an application (namely, the web server) doing its own scheduling, because the general-purpose Linux scheduler is a poor fit for this workload