#persist

====DRAFT====

Introduction
=========
Authorization is a critical function for DataONE nodes.  Improper protection of data can lead to for example, premature publication (leaks) of datasets, data corruption, data inconsistency, and if common enough, eventual lowered trust of the system.  Detecting access problems for a given objects would be very difficult to detect after the fact, so rigorous up-front testing of a node's authorization sub-system is important.

For testing purposes, Member node implementation of client authorization can only be considered a black-box, so the design of authorization testing cannot make many, if  any, inferences to simplify the testing process.  However, the number of authorization scenarios while not small, is tractable, and a systematic test setup, in terms of test objects and subjects, will minimize the overhead in testing and possibly provide a way to spot check existing nodes in production (regression testing).

Also, not all nodes allow creation of test data at test time, so the testing strategy relies on pre-population of appropriately configured test objects prior to testing by the node deployers / admins to test those nodes.


Dimensions of testing
--------------------------------
Clients are authorized for a given permission based on 6 interrelated concepts/mechanisms:
1) the accessPolicy of the object
2) the rightsHolder field of the object
3) use of symbolic principles
4) the identity of the user
5) use of a subject's subjectInfo (its mapped-identities & groups)
6). hierarchical permissions: CHANGE_PERMISSION implies WRITE implies READ.

Authorization testing needs to ensure member node implementation properly brings together these concepts to determine whether a client is authorized for a particular action (READ, WRITE, CHANGE_PERMISSION).

Testing strategy
-----------------------
1. While AccessPolicies allow for multiple AccessRules containing multiple Subjects given multiple Permissions, testing will be done against objects containing a single AccessRule containing a single Subject authorized for a single Permission.  This allows direct attibution of authorization to a particular AccessRule and Subject.
Testing of multiple Subjects and Permissions and AccessRules is currently out of scope.

2. Only 2 types of nodes can create test objects at test time, Tier 3 and 4 Member Nodes.  Coordinating Nodes do not publicly expose the create() method, so authorization testing is better accomplished with pre-populated test objects.  Similarly Tier 1 and 2 Member Nodes by definition do not implement the create() method, and similarly need to prepopulate themselves with test objects.

Requirements
 * Tier 1 Member Nodes - need at least one publicly readable object, in order to test the MNRead interface.  This would be an object with READ permission assigned to the symbolic PUBLIC subject.
 * Tier 2 Member Nodes - need 11 test objects with varying permissions assigned to various Subjects (Persons, Groups, symbolic principals)
 * Tier 3 & 4 Member Nodes - no data required.
 * Coordinating Nodes - need the same 11 test objects as Tier 2 Member Nodes, and also setup a few test Subjects (Persons and Groups) and map identities and group membership.
 * All nodes under test will need to accept certificates signed by the DataONE CA.

3. It is expected that Member Nodes will interrogate the subjectInfo passed to it in the client certificate, and Coordinating Nodes will instead refer to their internal IdentityManager to look up the subjectInfo.  The tests will be the same, but the starting requirements will be different.  In particular, tests of authoriation involving groups and mapped identities will not work if the CN's IdentityManager does not contain the expected relationships between the test subjects.


Testing Design
==============

Test Objects for Coordinating Nodes and Tier 2 Member Nodes
==============================================
Within the "single-access-rule-single-subject-single-permission" AccessPolices, there are 11 useful* combinations to test:

    +---AccessRule---+  
SubjectLabel     Permission   RightsHolder       TestObjectPid  
------------------------------------------------------------------
null             null         testPerson         TierTesting:testObject:RightsHolder_testPerson
null             null         testGroup          TierTesting:testObject:RightsHolder_testGroup

testPerson       READ         testRightsHolder   TierTesting:testObject:testPerson_READ
testPerson       WRITE        testRightsHolder   TierTesting:testObject:testPerson_WRITE
testPerson       CHANGE_PERM  testRightsHolder   TierTesting:testObject:testPerson_CHANGE

testGroup        READ         testRightsHolder   TierTesting:testObject:testGroup_READ
testGroup        WRITE        testRightsHolder   TierTesting:testObject:testGroup_WRITE
testGroup        CHANGE_PERM  testRightsHolder   TierTesting:testObject:testGroup_CHANGE

public           READ         testRightsHolder   TierTesting:testObject:Public_READ
authenticatedUser READ        testRightsHolder   TierTesting:testObject:Authenticated_READ
verifiedUser     READ         testRightsHolder   TierTesting:testObject:Verified_READ


*This is not a full enumeration of scenarios, but it is assumed that symbolic principals will not be assigned WRITE or CHANGE_PERMISSION permissions in access policies or placed in the rightsHolder property, and would be derivative tests at that point.

Test Subjects used by the integration tests
===============================
Object Subjects
------------------------
testSubmitter
testRightsHolder
testPerson
testGroup


Client Subjects                    MappedIdentites                  Groups              Verified?
----------------------------------------------------------------------------------------------------------------------------------------------------------------
testSubmitter                        none                                  none                   No
testRightsHolder                    none                                 none                    No
testPerson                            testMappedPerson             testGroup            Yes
testMappedPerson                 testPerson                        none                    No
testGroupie                            none                                testGroup             No

test_Person variants
- - - - - - - - - - - - - - - -
testPerson_ExpiredCert          testMappedPerson            testGroup            Yes
testPerson_UntrustedCert       testMappedPerson            testGroup            Yes
"null" user                              n/a                                   n/a                     n/a


Tests
=====
Tier 1 MemberNodes will not test isAuthorized() directly, but will be exercising the MNRead api, and will need to have object identifiers passed in for testing.  MNCore.listObjects() will be used to get these identifiers, and so listObjects() needs to return at least one object.

All others: will run a battery of tests against each of the 11 test Objects, testing either for proper success or failure from isAuthorized().

Each test object will be run against:

ClientSubject             testing Permission
================================
NoRights                    READ
                                 WRITE
                                 CHANGE_PERMISSION
testPerson                 READ
                                 WRITE
                                 CHANGE_PERMISSION
testMappedPerson      READ
                                 WRITE
                                 CHANGE_PERMISSION
**alternate client subjects will probably be needed for testing the symbolic principals.  We need to test the VALIDATED_USER object a bit more extensively with regards to groups and mappedIdentities.


The full enumeration (3 subjects x 3 tested permissions x 11 testObjects) allows for testing of:
1. proper permission level handling (WRITE implying READ)
2. testing access via rightsHolder 
3. testing access via accessPolicies
4. testing access via group membership
5. testing access via mappedIdentities
6. testing access via symbolic principals


It is unclear at this point how extensively to test anonymous clients and the "testPerson_Expired" certificate against the test objects.  The validation of certificates is a separate mechanism in all cases, so the full enumeration of situations is probably not be needed.

Procuring test objects
================
Procuring test objects involves knowing the tier level of a member node, and if using pre-populated test objects, requires knowing how to pick them out amongst other data on the machine.  Interrogating the system metadata of each item returned from listObjects is expensive and doesn't scale.  Use of isAuthorized() to filter the list doesn't work because it obviates the test.  That means that the tester should be able to either use pre-known object identifiers (pids), or choose from a small subset of pids from the returned ObjectList.  The easiest means to accomplish the latter is by using a common prefix for testObject identifiers.  Use of a prefix adds flexibility at runtime, but would introduce overhead in getting the identifiers.

The general procurement algorithm using a prefix:
------------------------------------------------------------------------
if (node.isServiceAvailable("MNStorage"))
     createTestObject(accessRule,rightsHolderSubject);     
else
    getTestObject(pidPrefix, accessRule, rightsHolderSubject);

getTestObject


Using a predetermined pid:
---------------------------------------
sysmeta = node.getSystemMetadata(pid);
if (sysmeta == null && node.isServiceAvailable("MNStorage"))
    pid = createTestObject(pid, accessRule, rightsHolderSubject);
    
return pid


Questions
---------------
1. Does the VERIFIED_USER principal intersect or union with other accessRule subjects?  (can an object be available to subject "Joe Smith" only if "Joe Smith" is verified?)
2. What's the expected behavior when the subject used to connect is not verified, but is mapped to an identity that is verified?
3. What's the difference between "equivalent identity" and "mapped identity"? 
    - under http://mule1.dataone.org/ArchitectureDocs-current/apis/CN_APIs.html?highlight=subject#CNIdentity.getSubjectInfo:  "get the information about a Person (their equivalent identities, and the group to which they belong"
   

Testing of valid and invalid certificates

Invalid certs:
- SubjectInfo that does not validate against the schema.
- SubjectInfo that does not have a Person record which matches that of the certificate Subject DN.
- SubjectInfo that has equivalent identity records which do not themselves have matching Person records.

Valid certs:
- "Missing" SubjectInfo.

More Questions  (Feb 21)
---------------------------------------
1. what exception is thrown when a certificate is not trusted?  NotAuthorized or InvalidToken?
2.