Monday, January 25, 2010

Search PDF files in MOSS


Search is one of the most powerful features of a MOSS portal. It lets users retrieve data seamlessly, showing the right data to the right people, and it extends across the whole Office suite. Office documents are often saved as PDF files before being uploaded to the portal.
Adobe provides an IFilter that enables searching inside PDF files. The steps below describe how to index the content of PDF files.
Getting Adobe IFilter 9 to work with SharePoint
Download the Adobe IFilter. If you are using Adobe Reader 8, you will need to download the IFilter from the Adobe site. If you are using version 9.0, the IFilter is already installed on the machine along with it.
Enable the PDF File Indexing
  • Download Adobe Reader 9.0, which includes IFilter 9.x.x.x, from http://www.adobe.com/products/acrobat/
  • Download the Acrobat PDF Picture. This will be used to display the pdf file icon. http://www.adobe.com/misc/linking.html
  • Add the PDF file type to the Extensions List for WSS search by editing the registry
    • Start regedit
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{Random GUID}\Gather\Search\Extensions\ExtensionList
    • Add PDF to the list as a new String Value. Use a new high value e.g. if 37 is the highest value, use "38" as the key with the value "pdf"
  • Add the Acrobat PDF picture to the SharePoint templates directory. Copy the Acrobat PDF picture called pdficon_small.gif into the 12 Hive\TEMPLATE\IMAGES folder, e.g. %programfiles%\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES.
  • Bind the Acrobat PDF picture to the PDF file type
    • Open the 12 Hive\TEMPLATE\XML\DOCICON.XML file
    • Find the <ByExtension> part
    • Add the following mapping: <Mapping Key="pdf" Value="pdficon_small.gif"/>
  • Set IFilter mapping in registry
    • Start regedit
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {E8978DA6-047F-4E3D-9C78-CDBE46041603} or modify if another GUID value already exists.
    • Open the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\
    • Add (or modify) the .pdf key
    • Add a Multi-String value with value {E8978DA6-047F-4E3D-9C78-CDBE46041603} or modify if another GUID value already exists.
  • Add the Adobe Reader folder to the environment path variable
    • Right Click on My Computer
    • Open Properties
    • Open the Advanced tab
    • Go to the Environment variables
    • Edit the Path variable
    • Add your Reader folder to the Path list, e.g. C:\Program Files\Adobe\Reader 9.0\Reader
  • Restart the Search service by restarting your server or executing the following commands:
    • Run: net stop osearch
    • Run: net start osearch
  • Crawl the PDF documents
    • Existing PDF documents that were crawled before the Adobe PDF IFilter was installed are not indexed during an incremental crawl. You would have to edit each existing PDF file to trigger the crawler to reindex it during an incremental crawl. It's easier to run a full crawl after you have installed the Adobe PDF IFilter.
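The DOCICON.XML edit from the icon-binding step can also be scripted. Below is a minimal Python sketch, assuming the file keeps its extension mappings as Mapping elements (with Key and Value attributes) under a ByExtension element. The function name add_icon_mapping and the trimmed SAMPLE document are made up for illustration; on a real server you would read and write the actual file in the 12 Hive, and keep a backup first.

```python
import xml.etree.ElementTree as ET


def add_icon_mapping(docicon_xml: str, extension: str, icon_file: str) -> str:
    """Return DOCICON.XML text with an extension-to-icon mapping added or updated."""
    root = ET.fromstring(docicon_xml)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("no ByExtension element found")

    # Update the mapping if the extension is already listed, otherwise append it.
    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == extension.lower():
            mapping.set("Value", icon_file)
            break
    else:
        ET.SubElement(by_extension, "Mapping", {"Key": extension, "Value": icon_file})

    return ET.tostring(root, encoding="unicode")


# Trimmed stand-in for the real file (assumption: same element names as DOCICON.XML).
SAMPLE = """<DocIcons>
  <ByExtension>
    <Mapping Key="doc" Value="icdoc.gif"/>
  </ByExtension>
</DocIcons>"""

updated = add_icon_mapping(SAMPLE, "pdf", "pdficon_small.gif")
print(updated)
```

Running the function twice with the same arguments leaves a single pdf mapping, so the script is safe to repeat.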
With this, the MOSS crawler also indexes the content of PDF files, so users can retrieve data from inside them.
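One step that is easy to get wrong by hand is the Path edit, where the Reader folder can end up listed twice. The following is a small Python sketch of the check, assuming a semicolon-separated Windows Path value. The helper name add_to_path and the sample Path value are hypothetical; the sketch only builds the new string and does not change any system setting.

```python
def add_to_path(path_value: str, folder: str) -> str:
    """Append folder to a semicolon-separated Path value unless it is already listed."""

    def normalise(entry: str) -> str:
        # Windows paths are case-insensitive; ignore a trailing backslash too.
        return entry.strip().rstrip("\\").lower()

    entries = [e for e in path_value.split(";") if e.strip()]
    if normalise(folder) in {normalise(e) for e in entries}:
        return path_value  # already present, nothing to do
    return ";".join(entries + [folder])


# Sample values: the Reader folder comes from the step above, the rest is made up.
current = r"C:\Windows\system32;C:\Windows"
reader = r"C:\Program Files\Adobe\Reader 9.0\Reader"
updated = add_to_path(current, reader)
print(updated)
```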

Saturday, January 23, 2010

Display PDF File in MOSS Portal

Steps to display a PDF file in a MOSS / SharePoint portal
Usually, when I sit down to write a blog post, I tend to get creative and put a lot of language around it. This time, I am going to cut the crap and get straight to it.
I want to display a PDF file in my MOSS portal. So how do I go about it?
Following are some simple approaches that your dev team can take.
1. Upload the PDF to a document library
Add a Content Editor Web Part to your site
Copy and paste the code below into the HTML source of the Content Editor
<embed height="500px" width="500px" src="http://siteurl/test.pdf" type="application/pdf">
2. Use a simple Page Viewer Web Part and give it the path of the PDF file in the document library
3. In many cases, the user will want to pick the PDF file dynamically based on some parameter. This was my case too. A simple web part will do the trick: create a custom web part that emits the HTML from approach 1, with the difference that the URL of the PDF file is generated at run time based on the business requirement.
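The logic behind approach 3 is just string building. Here is a minimal sketch in Python, assuming the file name is the run-time parameter. An actual web part would be written in .NET, so treat this as an illustration of the idea only; the function name pdf_embed_html and its parameters are hypothetical.

```python
from html import escape
from urllib.parse import quote


def pdf_embed_html(library_url: str, file_name: str,
                   width: int = 500, height: int = 500) -> str:
    """Build the embed tag from approach 1 for a PDF chosen at run time."""
    # Percent-encode the file name so spaces and similar characters survive in the URL.
    url = library_url.rstrip("/") + "/" + quote(file_name)
    return '<embed height="{h}px" width="{w}px" src="{src}" type="application/pdf">'.format(
        h=height, w=width, src=escape(url, quote=True)
    )


# The file name would come from a query string, a list item, or other business logic.
print(pdf_embed_html("http://siteurl", "test.pdf"))
```

Escaping the URL before writing it into the attribute matters once the file name comes from user input rather than a fixed string.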

Friday, January 22, 2010

Site Definition Vs Site Template

Choosing the site template is the first decision a developer makes when designing a SharePoint / MOSS portal. But what really is a site template? Is it the same as a site definition? Time to take a deeper look.

Both a site definition and a site template define the content of a portal. They specify everything the web site will contain, which may include content types, lists, features, web parts, event handlers, navigation elements and so on.

In layman's terms, a site definition is nothing but an XML file that defines which items need to be included in the new site being provisioned. In contrast, a site template is built on top of a site definition and records the changes made to that site definition.

When to Use Site Definitions

Customizing portal sites or SharePoint sites using site definitions is best suited to third-party developers and server administrators. Deploying a site definition requires access to the file system of the Web server, so the administrator should always be involved in the deployment of site definitions.

Although deploying a site definition requires more work, site definitions typically perform better when cached on the file system instead of in the database. In addition, you can achieve a higher level of customization by directly editing all the schema files without depending on the existing site definition as a site template does. Also, if you want to introduce new file types, view styles, and drop-down edit menus, you need to edit the schema files that make up the site definition.

Custom site definitions are upgrade-independent. Subsequent upgrades to MOSS may overwrite the existing default site definitions, so using custom site definitions shields your sites from potential upgrade issues.

However, there is no easy way to modify site definitions once they are deployed. There is always the possibility of breaking existing deployed sites derived from the site definition once you modify an existing site definition. You can only add to the site definition once it is deployed.

When to Use Site Templates

Site templates, compared to site definitions, are easy to create and deploy. You can make all customizations through the user interface. Also, you do not need to be a server administrator on the Web server to create and deploy site templates. Modifying a site template will not create issues in existing sites created from the template. It also eases deployment, as the template data is stored centrally in the configuration database.

Site templates are slower than site definitions because they are stored in the database. Templates in the database are compiled and executed every time a page is rendered. WSS does some performance optimization whereby it stores templates on the local Web server and a ghost of the page in the configuration database. However, you can easily prevent WSS from using a copy of the page by using Web Folders to open, modify, and save it. From that point forward, the database is used to render the page.

Site templates only work on SharePoint sites that are not portal sites (not based on the SPS templates). Furthermore, site templates are not ideally suited for a development environment. In effect, they are still customizations of a site definition. If the site definition does not exist on the server, the site template fails.

Generally, site templates are not as efficient as site definitions in a large-scale environment.


Here is the list of the site definitions available in MOSS 2007:


ID Name Type
STS#0 Team Site WSS
STS#1 Blank Site WSS
STS#2 Document Workspace WSS
MPS#0 Basic Meeting Workspace WSS
MPS#1 Blank Meeting Workspace WSS
MPS#2 Decision Meeting Workspace WSS
MPS#3 Social Meeting Workspace WSS
MPS#4 Multipage Meeting Workspace WSS
CENTRALADMIN#0 Central Admin Site WSS
WIKI#0 Wiki Site WSS
BLOG#0 Blog WSS
BDR#0 Document Center MOSS
OFFILE#1 Records Center MOSS
OSRV#0 Shared Services Administration Site MOSS
SPS#0 SharePoint Portal Server Site MOSS
SPSPERS#0 SharePoint Portal Server Personal Space MOSS
SPSMSITE#0 Personalization Site MOSS
SPSTOC#0 Contents area Template MOSS
SPSTOPIC#0 Topic area template MOSS
SPSNEWS#0 News Site MOSS
CMSPUBLISHING#0 Publishing Site MOSS
BLANKINTERNET#0 Publishing Site MOSS
BLANKINTERNET#1 Press Releases Site MOSS
BLANKINTERNET#2 Publishing Site with Workflow MOSS
SPSNHOME#0 News Site MOSS
SPSSITES#0 Site Directory MOSS
SPSCOMMU#0 Community area template MOSS
SPSREPORTCENTER#0 Report Center MOSS
SPSPORTAL#0 Collaboration Portal MOSS
SRCHCEN#0 Search Center with Tabs MOSS
PROFILES#0 Profiles MOSS
BLANKINTERNETCONTAINER#0 Publishing Portal MOSS
SPSMSITEHOST#0 My Site Host MOSS
SRCHCENTERLITE#0 Search Center MOSS


Time to see how it maps to 2010