Thursday 10 May 2012

Quick Analytics from SharePoint (kind of...)

SharePoints builtin analytics, interesting as they are, don't always provide you with what you need.

This week I had a request to provide a report that summarised access to a particular pdf document on a SharePoint site (accessed via a URL in an email), broken down by department, location and employee type. At first I thought I'd design a simple SharePoint solution that allowed site admins to upload a file to a library, then track the access to the document via an application page that provided a dynamic graph where they could slide and dice the information.

As cool as that would have been, since I was short on time, and this was a one off request, I decided to use Log Parser and PowerShell to create the report (which took a fraction of the time).

Here's what I did:

1. Copy the IIS log files (for the SharePoint web application in question) from the WFE servers into a single directory on my computer

2. Open PowerShell, change to the directory with the log files, run logparser against the files, and extract the data I want into a CSV file.

Obviously this is where you would customise Log Parser to output the data you're interested in, but the command I ran pulls out the number of times a user has opened the document (DocumentInQuestion.pdf) per day.

Here's the command:

logparser.exe -i:W3C "select cs-username as User, Count(*) as ReadCount, date into report.csv from .\* where cs-uri-stem = '/marketing/documents/DocumentInQuestion.pdf' and User Is Not Null and date > Timestamp('2012-04-01','yyyy-MM-dd') group by user,date" -o:CSV

3. Next, the fun bit, is using PowerShell to bring it all together. What I'm going to do is create an object to store each record in, then add the records to an array, and use the array to create the report.... Oh, and I'm going to use the Active Directory PowerShell module so that I can get some extra data about each user from Active Directory (department, office location and employee type).

First. create the object for storing each record in:

$request = New-Object psobject
$request | Add-Member -MemberType NoteProperty -Name "Name" -value ""
$request | Add-Member -MemberType NoteProperty -Name "Location" -value ""
$request | Add-Member -MemberType NoteProperty -Name "Department" -value ""
$request | Add-Member -MemberType NoteProperty -Name "EmployeeType" -value ""
$request | Add-Member -MemberType NoteProperty -Name "Reads" -value ""
Copy the data from the CSV file to a PowerShell object we can use:

$r  = Import-Csv .\report.csv

$r now contains each line of the CSV file (as a collection of objects), and we can access individual columns by their original name in the CSV file. I.e. $r[0].user

Next, create a new array, loop through each object in $r (aka each line of the imported CSV file), format the 'user' column (I need to trim of the domain name), passing the 'user' string to the Get-AdUser command (returning a user object containing the extra properties (displayName, Office and employeeType) I need), parse the object returned from Active Directory, extracting the contents into a new "request" object and merge it with the details from the current object in $r, and finally add it to the array. Phew... that was a long sentence, but really quite straight forward.

To break it down a little;

1. For each object (aka line of the CSV file) in $r
foreach($i in $r)

2. Getting the "user" property, trimming off the first 6 characters (which is our domain name and forward slash)
$i.User.Substring(6)

3. Pass the trimmed user name to the Get-AdUser commandlet, and request the additional properties, displayName, department, office and employeeType
$u = get-aduser $i.User.Substring(6) -Properties displayName,department,office,employeeType 

4. If the object returned from Active Directory is not empty, then create a new "request" object, and populate it with information from the current object in $r (aka current line in the CSV file), and add it to the array
if($u -ne ""){
$b = $request | Select-Object *; $b.Name = $u.displayName; $b.Location = $u.office; $b.Department = $u.department; $b.EmployeeType = $u.employeeType; $b.Reads = $i.ReadCount; $a += $b;
;}

Here's the full command:

$a = $null
$a = @()
foreach($i in $r){$u="";$u = get-aduser $i.User.Substring(6) -Properties displayName,department,office,employeeType -ErrorAction:SilentlyContinue; if($u -ne ""){
$b = $request | Select-Object *; $b.Name = $u.displayName; $b.Location = $u.office; $b.Department = $u.department; $b.EmployeeType = $u.employeeType; $b.Reads = $i.ReadCount; $a += $b;
;}}

Once we have done this, we have all the information we need in a new array of objects, and it's simply a case of formatting the data the way we want it and creating the report. Because we have an array of "request" objects, this is pretty easy, as we can iterate over them, group them, sort them, etc. I needed to display the following information:

Total amount of people who opened the document
Total amount of people by Office
Total Partners
Total Fee Earners
Total Business Services

My report looked like this:

$nr = @();
$nr += "Usage for: "+$doc
$nr += ""
$nr += "Total Count: "+$a.Count
$nr += ""
$nr += "Count per office:"
$b = $a | group location; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Total Partners"
$b = $a | ?{$_.Department -like "Partners"} | measure; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Partners / Office"
$b = $a | ?{$_.Department -like "Partners"} | group location; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Total Support Staff"
$b = $a | ?{$_.EmployeeType -like "SupportStaff"} | measure; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Support Staff / Office"
$b = $a | ?{$_.EmployeeType -like "SupportStaff"} | group location; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Total Fee Earning Staff"
$b = $a | ?{$_.EmployeeType -like "FeeEarningStaff"} | measure; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}
$nr += ""
$nr += "Fee Earning Staff / Office"
$b = $a | ?{$_.EmployeeType -like "FeeEarningStaff"} | group location; foreach($d in $b){$nr += $d.Count.ToString()+"`t"+$d.Name}

And when displayed, it looks like this...

Usage for: /marketing/documents/DocumentInQuestion.pdf

Total Count: 477

Count per office:
52      Hong Kong
42      Hamburg
8        Monaco
191    London
12      Le Havre
19      Paris
33      Dubai
36      Greece
29      Singapore
33      Shanghai
20      Singapore Local
2        Beijing

Total Partners
92

Partners / Office
58      London
1        Hamburg
5        Singapore
11      Greece
2        Shanghai
8        Dubai
3        Singapore Local
4        Paris

Total Support Staff
195

Support Staff / Office
23      Hong Kong
35      Hamburg
58      London
12      Le Havre
15      Paris
11      Greece
7        Dubai
15      Singapore Local
5        Shanghai
13      Singapore
1        Beijing

Total Fee Earning Staff
269

Fee Earning Staff / Office
6         Monaco
7         Hamburg
133     London
26       Dubai
29       Hong Kong
15       Greece
21       Singapore
28       Shanghai
4         Paris