Tuesday, November 13, 2007

Lab 2 Postmortem

Mapping the URLs
The most complex part of this lab was matching the URL signatures. I played with Tomcat for a while, but realized that using mod_rewrite with Apache would be much easier. Here are the steps:

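1) mod_rewrite ships with Apache 2.2, so just uncomment its LoadModule line in httpd.conf.
2) Add the following lines to the end of httpd.conf (JkMount is a mod_jk directive, so this assumes mod_jk is already loaded with a worker named "default"):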

# Send cs462 labs to worker named default
JkMount /cs462lab2/* default

This will send all URIs that start with "/cs462lab2/" to Tomcat.
3) Add a file named ".htaccess" to Apache's document root. This file contains the regex rules for rewriting the URL.

4) Add the following lines to the .htaccess file:

RewriteEngine on
RewriteRule ^index$ cs462lab2/SiteLister?option=index [NC]
RewriteRule ^domain/(([a-z0-9]([-a-z0-9]*[a-z0-9])?\.)+((a[cdefgilmnoqrstuwxz]|aero|arpa)|(b[abdefghijmnorstvwyz]|biz)|(c[acdfghiklmnorsuvxyz]|cat|com|coop)|d[ejkmoz]|(e[ceghrstu]|edu)|f[ijkmor]|(g[abdefghilmnpqrstuwy]|gov)|h[kmnrtu]|(i[delmnoqrst]|info|int)|(j[emop]|jobs)|k[eghimnprwyz]|l[abcikrstuvy]|(m[acdghklmnopqrstuvwxyz]|mil|mobi|museum)|(n[acefgilopruz]|name|net)|(om|org)|(p[aefghklmnrstwy]|pro)|qa|r[eouw]|s[abcdeghijklmnortvyz]|(t[cdfghjklmnoprtvwz]|travel)|u[agkmsyz]|v[aceginu]|w[fs]|y[etu]|z[amw]))/?$ cs462lab2/SiteLister?option=domain&domain=$1 [NC,L]

That's it!
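
To sanity-check what those two rules capture, here is a rough Java sketch that mimics them with java.util.regex. It is only an illustration: the TLD alternation is cut down to a hypothetical handful (com, net, org, edu) to keep it readable, and the class and method names are made up. In .htaccess context mod_rewrite matches against the path without the leading slash, which is why the inputs below have none.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mimics the two RewriteRule lines above.
// It is not part of the lab code; mod_rewrite does the real work.
class RewriteSketch {
    // Equivalent of "^index$" with the [NC] (case-insensitive) flag.
    private static final Pattern INDEX =
        Pattern.compile("^index$", Pattern.CASE_INSENSITIVE);

    // Same shape as the domain rule, but with a shortened TLD list (assumption).
    private static final Pattern DOMAIN = Pattern.compile(
        "^domain/(([a-z0-9]([-a-z0-9]*[a-z0-9])?\\.)+(com|net|org|edu))/?$",
        Pattern.CASE_INSENSITIVE);

    // Returns the rewritten path, or null if neither rule matches.
    static String rewrite(String path) {
        if (INDEX.matcher(path).matches()) {
            return "cs462lab2/SiteLister?option=index";
        }
        Matcher m = DOMAIN.matcher(path);
        if (m.matches()) {
            // group(1) plays the role of $1 in the RewriteRule substitution.
            return "cs462lab2/SiteLister?option=domain&domain=" + m.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(rewrite("index"));
        System.out.println(rewrite("domain/www.byu.edu/"));
        System.out.println(rewrite("domain/not a domain"));
    }
}
```

The optional trailing slash is outside the first capture group, so "domain/www.byu.edu" and "domain/www.byu.edu/" both pass the same domain value to the servlet.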

Calling HTTP Get
Because I'm using Java and Servlets, I have the luxury of choosing from many open-source libraries.
1) To call S3 for the data, I used HttpClient, which works well for HTTP GET and POST calls.
2) To convert the JSON data to Java types, I used json-lib.
3) To format the XML, I used JDOM.

These libraries minimized the code I had to write.

2 comments:

Hilton said...

Why the crazy long regex?

Unknown said...

mod_rewrite is extremely powerful. I attended a mod_rewrite session at ApacheCon and found that it can do much more than I ever thought it could.

That being said, I used Java/Tomcat/Servlets for my Lab 3 and found that the servlet mapping mechanism was far easier than trying to use mod_rewrite. My Lab 3 web.xml file looks something like:

<servlet>
  <servlet-name>Domains</servlet-name>
  <servlet-class>edu.byu.cs.mah292.DomainsServlet</servlet-class>
</servlet>

<servlet>
  <servlet-name>Submit</servlet-name>
  <servlet-class>edu.byu.cs.mah292.SubmitServlet</servlet-class>
</servlet>

<servlet-mapping>
  <servlet-name>Submit</servlet-name>
  <url-pattern>/submit</url-pattern>
</servlet-mapping>

<servlet-mapping>
  <servlet-name>Domains</servlet-name>
  <url-pattern>/*</url-pattern>
</servlet-mapping>

As you can see, there are no regular expressions and it's very simple and very direct.
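
The reason "/submit" still reaches the Submit servlet even though "/*" matches everything is that the servlet container tries an exact match before a path-prefix match. Here is a simplified Java sketch of that lookup order, assuming only the two mappings above. The class and method names are hypothetical, and a real container also handles extension mappings and the default servlet, which this leaves out.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of how a servlet container picks a servlet.
// Only exact and path-prefix ("/*") patterns are modeled here.
class MappingSketch {
    static final Map<String, String> MAPPINGS = new LinkedHashMap<>();
    static {
        MAPPINGS.put("/submit", "Submit");
        MAPPINGS.put("/*", "Domains");
    }

    // Returns the servlet name for a request path, or null if nothing matches.
    static String resolve(String path) {
        // Step 1: an exact pattern wins over any wildcard pattern.
        for (Map.Entry<String, String> e : MAPPINGS.entrySet()) {
            if (!e.getKey().endsWith("/*") && e.getKey().equals(path)) {
                return e.getValue();
            }
        }
        // Step 2: otherwise the longest matching path prefix wins.
        String best = null;
        int bestLen = -1;
        for (Map.Entry<String, String> e : MAPPINGS.entrySet()) {
            String pattern = e.getKey();
            if (!pattern.endsWith("/*")) {
                continue;
            }
            // Strip the trailing "/*"; for the pattern "/*" the prefix is "".
            String prefix = pattern.substring(0, pattern.length() - 2);
            boolean matches = path.equals(prefix) || path.startsWith(prefix + "/");
            if (matches && prefix.length() > bestLen) {
                best = e.getValue();
                bestLen = prefix.length();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(resolve("/submit"));
        System.out.println(resolve("/www.byu.edu"));
    }
}
```

Note that only the exact path "/submit" goes to Submit; something like "/submit/extra" falls through to the "/*" mapping.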