Wednesday, March 30, 2011

Validating URL/URI in Ruby on Rails


Wouldn't it be nice if you could validate the format and the existence of a URI/URL provided by the users in your Rails application? Well, I thought so. So after a little research I put together avalidates_uri_existence_of.rb which does exactly that. Take a look below:
equire 'net/http'
 
# Original credits: http://blog.inquirylabs.com/2006/04/13/simple-uri-validation/
# HTTP Codes: http://www.ruby-doc.org/stdlib/libdoc/net/http/rdoc/classes/Net/HTTPResponse.html
 
class ActiveRecord::Base
def self.validates_uri_existence_of(*attr_names)
configuration = { :message => "is not valid or not responding", :on => :save, :with => nil }
configuration.update(attr_names.pop) if attr_names.last.is_a?(Hash)
 
raise(ArgumentError, "A regular expression must be supplied as the :with option of the configuration hash") unless configuration[:with].is_a?(Regexp)
 
validates_each(attr_names, configuration) do |r, a, v|
if v.to_s =~ configuration[:with] # check RegExp
begin # check header response
case Net::HTTP.get_response(URI.parse(v))
when Net::HTTPSuccess then true
else r.errors.add(a, configuration[:message]) and false
end
rescue # Recover on DNS failures..
r.errors.add(a, configuration[:message]) and false
end
else
r.errors.add(a, configuration[:message]) and false
end
end
end
end
 
Code above is a barebones check which first validates the URL format against a provided regular expression (http://www.etc...) and if it passes, it sends a HEAD (header) request and checks if we get a 200 Code (HTTPSuccess) back. You could easily extend it to handle redirects, etc.
How do you use it in your application? Copy the code below into a validates_uri_existence_of.rb and drop it into your lib directory in your Rails application. Next, open up your environment.rb (under /config) and at the bottom of the file add:
equire 'validates_uri_existence_of'
Almost there, now we can use our new custom validation in any model by calling:
validates_uri_existence_of :url, :with =>
/(^$)|(^(http|https):\/\/[a-z0-9]+([\-\.]{1}[a-z0-9]+)*\.[a-z]{2,5}(([0-9]{1,5})?\/.*)?$)/ix
 
:url is the field you want to validate, and with specifies the Regular expression to check the URL against. One in the example should work for almost all valid http / https links. I haven't found a problem with it yet. Theoretically, you could change the expression to validate only URL's within a specific domain or a set of domains.
So, now we can validate the format of the URL and check that the URL is alive and breathing (returns Code 200).

No comments:

Post a Comment