Alfresco Version Pruning Behaviour....Mmmm Prunes

This is the first post of a multi-part series for an umbrella project I might be calling "alfresco-content-control." It's a working title. In this particular post we'll demonstrate how you can limit your version history and prune the unneeded versions. I racked my brain trying to think of a cooler theme for this post, but the best I can come up with is....the elderly and their love for prunes. Mainly for the digestive benefits...and possibly for the taste. Neither of which is related to enterprise content management...but we'll just go with it.

I should say that this problem has already been tackled by one of my fellow Alfresco colleagues, Jared Ottley, who also blogged about it in his Max Version Policy post. His project on GitHub was an amazing foundation and actually provided the bulk of the logic needed for a prospective customer of mine. I just needed to turn the hearing aid up to 11 to cover some of the following requirements that were missing....

  • Expose the max version property in Alfresco Share through content model properties.
  • Provide the ability to dynamically apply the version pruning behavior to individual pieces of content.
  • Provide the ability to keep the root version if needed. This however can be a slippery slope since you may want to mark specific versions as ones to keep permanently. We're not going to tackle that. At least not for now.

So with our list of enhanced requirements, I had enough information to take the project to the next level. Boom! Crack open a can of PBR because we're halfway there!

The first thing we had to do was to create a super simple content model with an aspect (or property group) called "Version Prunable." Essentially our goal was to take the properties that were initially stored in a file on the classpath and elevate them into a content model. That included the existing property for maximum version count and an additional boolean property to provide the flexibility to keep the root version or not. By moving these into a content model, we can easily expose the properties and their associated values in Share.

version-pruning-model.xml:

<?xml version="1.0" encoding="UTF-8"?>
<model name="prune:versionPruningModel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
    <description>Version Pruning Content Model</description>
    <author>Kyle Adams</author>
    <version>1.0</version>

    <imports>
        <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d" />
        <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm" />
    </imports>

    <namespaces>
        <namespace uri="http://www.alfresco.org/model/extension/version-pruning/1.0" prefix="prune" />
    </namespaces>

    <aspects>
        <!-- Version Prunable Aspect -->
        <aspect name="prune:versionPrunable">
            <title>Version Prunable</title>
            <properties>
                <property name="prune:maxVersionCount">
                    <title>Max Version Count</title>
                    <description>Max Version Count</description>
                    <type>d:int</type>
                    <default>-1</default>
                </property>
                <property name="prune:keepRootVersion">
                    <title>Keep Root Version</title>
                    <description>Keep Root Version</description>
                    <type>d:boolean</type>
                    <default>false</default>
                </property>
            </properties>
        </aspect>
    </aspects>
</model>

Then we had to make some small modifications to the behavior (or behaviour if you think that's more proper). I feel like I should be drinking my PBR with my pinky turned up when saying behaviouuuuur.

Anyways, this included adding logic to pull in property values from a given content node and an if statement to check whether the Version Prunable aspect has been applied to the node. Another minor addition was the if/else block that decides whether we need to delete the root version or the successor of the root version within the VersionHistory.

Here's the VersionPruningBehaviour.java implementation (BTW Squarespace has horrible syntax highlighting support for Java code blocks...sorry?)

package org.alfresco.extension.version.pruning.behaviour;

import org.alfresco.extension.version.pruning.model.VersionPruningContentModel;
import org.alfresco.model.ContentModel;
import org.alfresco.repo.policy.Behaviour;
import org.alfresco.repo.policy.JavaBehaviour;
import org.alfresco.repo.policy.PolicyComponent;
import org.alfresco.repo.version.VersionServicePolicies;
import org.alfresco.service.ServiceRegistry;
import org.alfresco.service.cmr.repository.NodeRef;
import org.alfresco.service.cmr.repository.NodeService;
import org.alfresco.service.cmr.version.Version;
import org.alfresco.service.cmr.version.VersionHistory;
import org.alfresco.service.cmr.version.VersionService;
import org.apache.commons.logging.Log;
import org.apache.commons.logging.LogFactory;

/**
 * Created by kadams on 7/14/15.
 */
public class VersionPruningBehaviour implements VersionServicePolicies.AfterCreateVersionPolicy {
    private static final Log logger = LogFactory.getLog(VersionPruningBehaviour.class);

    private ServiceRegistry serviceRegistry;
    private PolicyComponent policyComponent;
    private NodeService nodeService;
    private VersionService versionService;

    public void init(){
        this.policyComponent.bindClassBehaviour(
                VersionServicePolicies.AfterCreateVersionPolicy.QNAME,
                ContentModel.TYPE_CONTENT,
                new JavaBehaviour(this, "afterCreateVersion", Behaviour.NotificationFrequency.TRANSACTION_COMMIT));

        this.nodeService = this.serviceRegistry.getNodeService();
        this.versionService = this.serviceRegistry.getVersionService();

    }

    @Override
    public void afterCreateVersion(NodeRef versionedNodeRef, Version version) {

        try {
            if(this.nodeService.hasAspect(versionedNodeRef, VersionPruningContentModel.ASPECT_VERSION_PRUNABLE)) {
                VersionHistory versionHistory = this.versionService.getVersionHistory(versionedNodeRef);
                if(versionHistory != null){
                    // Read the settings into locals rather than instance fields: Spring beans are
                    // singletons by default, so fields could be overwritten by a concurrent transaction.
                    // Unset properties fall back to "don't keep the root" and "don't prune".
                    Boolean keepRootValue = (Boolean) this.nodeService.getProperty(versionedNodeRef, VersionPruningContentModel.PROP_KEEP_ROOT_VERSION);
                    Integer maxVersionValue = (Integer) this.nodeService.getProperty(versionedNodeRef, VersionPruningContentModel.PROP_MAX_VERSION_COUNT);
                    boolean keepRootVersion = Boolean.TRUE.equals(keepRootValue);
                    int maxVersionCount = (maxVersionValue != null) ? maxVersionValue : -1;

                    if(maxVersionCount > 0){
                        while(versionHistory.getAllVersions().size() > maxVersionCount){
                            Version versionToBeDeleted = null;
                            if(keepRootVersion) {
                                 versionToBeDeleted = versionHistory.getSuccessors(versionHistory.getRootVersion()).iterator().next();
                            }
                            else{
                                versionToBeDeleted = versionHistory.getRootVersion();
                            }

                            if(logger.isDebugEnabled()){
                                logger.debug("Max Version Count: " + maxVersionCount);
                                logger.debug("Keep Root Version? " + keepRootVersion);
                                logger.debug("Current version history collection size: " + versionHistory.getAllVersions().size());
                                logger.debug("Preparing to remove version: " + versionToBeDeleted.getVersionLabel() + " type: " + versionToBeDeleted.getVersionType());
                            }
                            this.versionService.deleteVersion(versionedNodeRef, versionToBeDeleted);
                            versionHistory = this.versionService.getVersionHistory(versionedNodeRef);
                        }
                    }
                }
                else{
                    if(logger.isDebugEnabled()){
                        logger.debug("No version history found!");
                    }
                }
            }
        } catch (Exception e) {
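            // Report through the class logger so the failure shows up in the repository log
            logger.error("Version pruning failed for node " + versionedNodeRef, e);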
        }
    }

    public void setServiceRegistry(ServiceRegistry serviceRegistry) {
        this.serviceRegistry = serviceRegistry;
    }
    public void setPolicyComponent(PolicyComponent policyComponent) {
        this.policyComponent = policyComponent;
    }
}
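If you want to sanity-check the pruning rules without spinning up a repository, the loop above can be sketched in plain Java. This is only a simulation under a few assumptions: the class and method names (VersionPruningSketch, prune) are made up for illustration, the version history is modelled as a simple list of labels ordered oldest first, and index 0 stands in for the root version. No Alfresco classes are involved.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone simulation of the pruning loop; hypothetical names, no Alfresco classes. */
final class VersionPruningSketch {

    /**
     * Returns the version labels that survive pruning.
     * The input is ordered oldest first, so index 0 plays the role of the root version.
     */
    static List<String> prune(List<String> versions, int maxVersionCount, boolean keepRootVersion) {
        List<String> history = new ArrayList<>(versions);
        if (maxVersionCount <= 0) {
            return history; // -1 (the model default) means "never prune"
        }
        while (history.size() > maxVersionCount) {
            // Inside the loop the size is at least 2, so index 1 (the root's successor) exists.
            int victim = keepRootVersion ? 1 : 0;
            history.remove(victim);
        }
        return history;
    }

    public static void main(String[] args) {
        List<String> labels = List.of("1.0", "1.1", "1.2", "1.3", "1.4");
        System.out.println(prune(labels, 3, false)); // [1.2, 1.3, 1.4]
        System.out.println(prune(labels, 3, true));  // [1.0, 1.3, 1.4]
        System.out.println(prune(labels, -1, true)); // [1.0, 1.1, 1.2, 1.3, 1.4]
    }
}
```

Notice that with keepRootVersion switched on, the oldest survivor is always the root and the gap opens up right behind it.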

ROCK! Now we have everything in place to keep our VersionHistory from getting out of hand. I should however provide an alternative. The latest version of Alfresco's DoD 5015.2-certified Records Management module supports the ability to retain and subsequently destroy individual versions of a work-in-process document. This module is a far better approach for content that is regulatory in nature. Now if you're not managing regulatory content in Alfresco, the RM module might be more horsepower than you need. So let's take a quick tour of the version pruning behaviour functionality. Demo........ENGAGE!

So after leveraging a good base from Jared Ottley's Max Version Policy project, we were able to expose version pruning behaviour configuration pretty easily in Share. You can access the source for the alfresco-content-control umbrella project here on GitHub.

- Keep Calm and Grandpa On

Oldie but a Goldie: XPath Metadata Extraction

It's been quite some time since I've written an Alfresco blog post, but I finally decided to commit to being more active in the Alfresco community through blogs and other activities. Since I'm dusting off the old blog and starting anew, I thought it would be fitting to dust off an old Alfresco feature that can still prove to be useful. 

When thinking about a theme for the post, the first thing that came to mind was...well...hipsters...

 

Hipsters often revitalize old trends and from the picture above we find that some trends are better than others. As a Principal Solutions Engineer, I've found one vintage Alfresco feature that has proven to be useful in more than one of my pre-sales opportunities with prospective customers. If we harken back to the Alfresco 2.x days, the now defunct WCM AVM product had a cool feature to perform XPath Metadata Extraction. Essentially this was used to map XML content to AVM Web Forms that would later be used in a presentation layer of your choice. While AVM has been laid to rest, the XPathMetadataExtracter class lives on as a core repository capability. 

Now does this vintage and possibly hipster Alfresco feature deserve to be brought back into the mainstream? I say, absolutely! This is actually a very common pattern being used by lots of organizations for processing anything from financial statements to technical publications (DITA).

So let's actually dive in with an example in which we'll manage hipster or indie rock artist XML content within Alfresco. DISCLAIMER: I actually have very hipster taste in music...so don't judge too harshly ;)

The first thing we'll need is a hipster content model to manage our artists. In the model below, you'll see that it's a pretty simple model with text and date properties, a few of which are multi-value properties.

hipster-model.xml:

<?xml version="1.0" encoding="UTF-8"?>
<model name="hip:hipsterModel" xmlns="http://www.alfresco.org/model/dictionary/1.0">
  
  <description>Hipster Content Model</description>
  <author>Kyle Adams</author>
  <version>1.0</version>
  
  <imports>
    <import uri="http://www.alfresco.org/model/dictionary/1.0" prefix="d" />
    <import uri="http://www.alfresco.org/model/content/1.0" prefix="cm" />
  </imports>
  
  <namespaces>
    <namespace uri="http://www.massnerder.io/model/1.0" prefix="hip" />
  </namespaces>
  <constraints>
    <!-- Indie Genre Constraint -->
    <constraint name="hip:genreConst" type="LIST">
      <parameter name="allowedValues">
        <list>
          <value>Emo</value>
          <value>Garage Rock</value>
          <value>Hardcore</value>
          <value>Indie Americana</value>
          <value>Indie Doo-Wop</value>
          <value>Indie Folk</value>
          <value>Indie Pop</value>
          <value>Indietronica</value>
          <value>Lo-fi</value>
          <value>Nu-hula</value>
          <value>Pop Punk</value>
          <value>Post Hardcore</value>
          <value>Surf Rock</value>
        </list>
      </parameter>
    </constraint>
  </constraints>
  
  <!-- Content Types -->
  <types>
    <!-- Artist Type -->
    <type name="hip:artist">
      <title>Artist</title>
      <parent>cm:content</parent>
      <properties>
        <property name="hip:artistName">
          <title>Artist Name</title>
          <type>d:text</type>
          <index enabled="true">
            <atomic>true</atomic>
            <stored>false</stored>
            <tokenised>false</tokenised>
          </index>
        </property>
        <property name="hip:label">
          <title>Label</title>
          <type>d:text</type>
          <index enabled="true">
            <atomic>true</atomic>
            <stored>false</stored>
            <tokenised>false</tokenised>
          </index>
        </property>
        <property name="hip:origin">
          <title>Origin</title>
          <type>d:text</type>
          <index enabled="true">
            <atomic>true</atomic>
            <stored>false</stored>
            <tokenised>false</tokenised>
          </index>
        </property>
        <property name="hip:genres">
          <title>Genres</title>
          <type>d:text</type>
          <multiple>true</multiple>
          <index enabled="true">
            <atomic>true</atomic>
            <stored>false</stored>
            <tokenised>false</tokenised>
          </index>
          <constraints>
            <constraint ref="hip:genreConst"/>
          </constraints>
        </property>
        <property name="hip:members">
          <title>Members</title>
          <type>d:text</type>
          <multiple>true</multiple>
          <index enabled="true">
            <atomic>true</atomic>
            <stored>false</stored>
            <tokenised>false</tokenised>
          </index>
        </property>
        <property name="hip:formed">
          <title>Date Formed</title>
          <type>d:date</type>
        </property>
        <property name="hip:disbanded">
          <title>Date Disbanded</title>
          <type>d:date</type>
        </property>
      </properties>
    </type>
  </types>
</model>

For now we'll assume you know that you have to bootstrap the content model XML using a Spring context file, and we'll stick to the important files. Next up, we need a Spring context file to bootstrap our XPath metadata extraction configuration. The most important part of this Spring context file is where we load the hipster-model-mappings.properties file and the hipster-model-xpath-mappings.properties file in the extracter.xml.HipsterModelMetadataExtracter bean definition.

hipster-xml-metadata-extraction-context.xml:

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE beans PUBLIC '-//SPRING//DTD BEAN//EN' 'http://www.springframework.org/dtd/spring-beans.dtd'>

<!-- Configurations for XmlMetadataExtracters -->
<beans>
   <!-- An extractor that operates on Alfresco models -->
   <bean id="extracter.xml.HipsterModelMetadataExtracter"
         class="org.alfresco.repo.content.metadata.xml.XPathMetadataExtracter"
         parent="baseMetadataExtracter"
         init-method="init" >
      <property name="mappingProperties">
         <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
            <property name="location">
               <value>classpath:alfresco/module/massnerder-blog-xpath-metadata-extraction/metadata/extraction/hipster-model-mappings.properties</value>
            </property>
         </bean>
      </property>
      <property name="xpathMappingProperties">
         <bean class="org.springframework.beans.factory.config.PropertiesFactoryBean">
            <property name="location">
               <value>classpath:alfresco/module/massnerder-blog-xpath-metadata-extraction/metadata/extraction/hipster-model-xpath-mappings.properties</value>
            </property>
         </bean>
      </property>
   </bean>

   <!-- A selector that executes XPath statements -->
   <bean
         id="extracter.xml.selector.HipsterXPathSelector"
         class="org.alfresco.repo.content.selector.XPathContentWorkerSelector"
         init-method="init">
      <property name="workers">
         <map>
            <entry key="/*">
               <ref bean="extracter.xml.HipsterModelMetadataExtracter" />
            </entry>
         </map>
      </property>
   </bean>

   <!-- The wrapper XML metadata extracter -->
   <bean
         id="extracter.xml.HipsterXMLMetadataExtracter"
         class="org.alfresco.repo.content.metadata.xml.XmlMetadataExtracter"
         parent="baseMetadataExtracter">
      <property name="overwritePolicy">
         <value>EAGER</value>
      </property>
      <property name="selectors">
         <list>
            <ref bean="extracter.xml.selector.HipsterXPathSelector" />
         </list>
      </property>
   </bean>
</beans>

Let's have a closer look at the hipster-model-mappings.properties. The main purpose of this file is to tell the XPathMetadataExtracter class which content model and which metadata properties we will be using during the extraction.

hipster-model-mappings.properties:

# Namespaces
namespace.prefix.hip=http://www.massnerder.io/model/1.0

# Mappings
artistName=hip:artistName
label=hip:label
origin=hip:origin
genres=hip:genres
members=hip:members
formed=hip:formed
disbanded=hip:disbanded
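To make the prefix business a little more concrete, here's a rough sketch of how a value like hip:artistName can be expanded into a fully qualified name using the namespace.prefix.* entry. This is not Alfresco's actual parsing code (the repository works with its own QName objects); the class and method names are invented for illustration and only java.util.Properties is used.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

/** Illustrative only: expands "prefix:localName" values using namespace.prefix.* entries. */
final class MappingResolutionSketch {
    private static final String NAMESPACE_KEY_PREFIX = "namespace.prefix.";

    /** Returns document-property name -> "{namespaceUri}localName". */
    static Map<String, String> resolve(String propertiesText) throws IOException {
        Properties props = new Properties();
        props.load(new StringReader(propertiesText));

        Map<String, String> resolved = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(NAMESPACE_KEY_PREFIX)) {
                continue; // namespace declarations are not mappings themselves
            }
            String value = props.getProperty(key);
            int colon = value.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Mapping has no prefix: " + key + "=" + value);
            }
            String uri = props.getProperty(NAMESPACE_KEY_PREFIX + value.substring(0, colon));
            if (uri == null) {
                throw new IllegalArgumentException("Undeclared prefix in mapping: " + key + "=" + value);
            }
            resolved.put(key, "{" + uri + "}" + value.substring(colon + 1));
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        String text = String.join("\n",
                "namespace.prefix.hip=http://www.massnerder.io/model/1.0",
                "artistName=hip:artistName",
                "genres=hip:genres");
        // prints {http://www.massnerder.io/model/1.0}artistName
        System.out.println(resolve(text).get("artistName"));
    }
}
```

The takeaway is that every mapping value has to use a prefix declared at the top of the same file, otherwise there is nothing to expand it against.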

The hipster-model-xpath-mappings.properties file is where all the magic happens. It takes the metadata property names defined in the previous hipster-model-mappings.properties file and pairs each one with an XPath expression that extracts the value from the XML files we'll upload afterwards.

hipster-model-xpath-mappings.properties:

# Hipster Property XPath Mappings
artistName=/artist/@name
label=/artist/label
origin=/artist/origin
genres=/artist/genres/genre/text()
members=/artist/members/member/text()
formed=/artist/formed
disbanded=/artist/disbanded

And lastly we have a sample artist XML file with the following content that we'll upload to a Share collaboration site.

shakey-graves.xml:

<?xml version="1.0" encoding="UTF-8"?>
<artist name="Shakey Graves">
    <label>Independent</label>
    <origin>Austin, Texas, USA</origin>
    <genres>
        <genre>Indie Americana</genre>
    </genres>
    <members>
        <member>Alejandro Rose-Garcia</member>
    </members>
    <formed>2007</formed>
    <disbanded></disbanded>
</artist>
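If you'd like to try the XPath expressions before deploying anything, you can run them against the sample with the JDK's built-in javax.xml.xpath API. Treat this as a sketch: the class and method names are made up, the XML is a trimmed copy of the sample above, and it says nothing about how XPathMetadataExtracter evaluates expressions or converts values internally.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Illustrative only: evaluates one XPath expression and collects every match as text. */
final class XPathMappingSketch {

    static List<String> extract(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            // Works for attribute, element and text() matches alike
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Trimmed copy of shakey-graves.xml
        String xml = "<artist name=\"Shakey Graves\">"
                + "<origin>Austin, Texas, USA</origin>"
                + "<genres><genre>Indie Americana</genre></genres>"
                + "<members><member>Alejandro Rose-Garcia</member></members>"
                + "</artist>";
        System.out.println(extract(xml, "/artist/@name"));                 // [Shakey Graves]
        System.out.println(extract(xml, "/artist/origin"));                // [Austin, Texas, USA]
        System.out.println(extract(xml, "/artist/genres/genre/text()"));   // [Indie Americana]
        System.out.println(extract(xml, "/artist/members/member/text()")); // [Alejandro Rose-Garcia]
    }
}
```

Returning a list for every expression is a deliberate choice here: an artist with several genre or member elements would simply produce more entries, which lines up with the multi-value properties in the model.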

When we upload the shakey-graves.xml file to our Discography collaboration site and specialize it to the Artist content type, we get the following results:

METADATA EXTRACTION RESULTS IN ALFRESCO SHARE

 

Check out this video to see XPath Metadata Extraction working in real-time...

 

In true hipster fashion, we've taken an old feature and brought it back to life. Hipsters and hipster trends get a bad rap, but I think XPath Metadata Extraction has proven to be pretty useful in my experience. So get out there and grow an ironic mustache, make your own clothing, and pickle things that should never be pickled! When you've got all your hipster gear in check, grab the source from this post on GitHub here.

Also come join us at Alfresco Day in San Francisco on August 4th, 2015. The registration page and event details can be found here!

Keep Calm and Hipster On!!!

Mass Nerder

This is my inaugural blog post! Woo hoo! Alright!

I suppose that Mass Nerder requires a bit of an explanation. I promise it's not murder-y. It's actually a song title from one of my favorite punk rock bands, The Descendents. If you already know who they are, you've earned extra cool points in my book. If not, dive into the wonderful world of endless links known as Wikipedia....

Most of the entries in this section will be technology and Alfresco related, but just this once I'll break the rules. 

- Keep Calm and Nerd On