Software & Technology – Page 4

Software & Technology

Web based drawing canvas (no HTML5)

November 9, 2010

I have a need for a web based drawing tool, much like Microsoft Paint – just a simple canvas that allows me to draw on it. I’ve searched around a bit and have found that the majority of the tools are Flash based and the ones that are not are lacking in one way or another.

So I decided to create a simple, purely web based drawing tool. This works in all browsers and even works on my iPad (although I have to click each square to form the drawing on the iPad since dragging simply moves the screen). The canvas, with a drawing, is shown in Figure 1.

It uses some simple JavaScript and CSS to achieve the effect. The grid lines, border, and cell size are all controlled by CSS. Just click or click and drag to draw. Do the same while holding down the SHIFT key to erase (draw in alternate color). I chose to use the SHIFT key combination for color 2 to avoid conflict with context menus when right clicking and OS specific uses for ALT and CTRL.

Figure 1: Picture of a drawing done on “Canvas” (demo link below).

Everything is encapsulated inside the Canvas object, so it is really easy to use. All you need is a DIV, a little bit of CSS, and the couple lines of JavaScript below.

Here’s how to use it:

// Creates the canvas - just make sure you have a DIV with an id of "drawingArea"
// The last parameter of false allows editing - set to true for read only
// Note that these are squares - actual square size is set via CSS
var myCanvas = new Canvas("drawingArea", 60, 40, false);

// This will change the drawing colors
myCanvas.setFillColor("#CCCCCC"); // #000 by default
myCanvas.setEraseColor("#555555"); // #FFF by default

// This will resize the canvas, preserving what portions of the drawing it can
myCanvas.resize(100, 10); // 100 wide by 10 high

// This will save the contents of the canvas down to a JSON string
var myData = myCanvas.save();

// This will load a drawing from a JSON string
myCanvas.load(myData);

You can view a working demo here.

You can find the JS file here.

Note that you’ll also need the JSON2 JS file as well from here.

An improvement idea:
When mouse moves fast, lines have gaps in them. It would be pretty easy to write an algorithm that figures out empty squares between two points and fills them in. Alternatively, it wouldn’t be too difficult to handle this without tables by tracking the mouse location within the DIV then drawing points/lines.

Feel free to use this as you see fit. I just ask that you:

Leave the copyright information at the top of the JS file in place.
Don’t sell the code or any product that has this code as the core feature without my consent.

Enjoy!

Software & Technology

JavaScript Event to an Object Instance

November 5, 2010

3 Comments

I searched the web for quite a while and couldn’t find a good answer to this problem, so I figured I’d post it to save the next person a little time.

I have a situation where I want to send an event to an instance of a JavaScript object. I could get it to go to the object itself, but was having trouble getting it to apply to only a specific instance of the object.

The answer was surprisingly simple – just wrap the pointer to your event handler function in an anonymous function. Doing this gives the anonymous function (closure) access to the variables currently in scope, which means that I have access to my object instance now.

The code is below. The commented out event listener add lines were the original attempt that would result in “undefined” being printed to the screen instead of my instance variable value. The uncommented version correctly calls my instance, which prints “test” to the debug area of the screen each time it is clicked.

// The object to handle the event...
function TestListenerObj(txt) {
    this.eventText = txt;
    this.handleEvent = function() {
        var debug = document.getElementById("debug");
        debug.innerHTML = debug.innerHTML + this.eventText + "<br/>";
        return false;
    }
}

// Sets up the event handler
function init() {
    var link = document.getElementById("test_link");
    var obj = new TestListenerObj("test");

    if (link.addEventListener) {
        // link.addEventListener("click", obj.handleEvent, false); // Calls "class" level handleEvent()
        link.addEventListener("click", function() { obj.handleEvent(); }, false); // Calls instance level handleEvent()
    } else {
        // link.attachEvent("onclick", obj.handleEvent ); // Calls "class" level handleEvent()
        link.attachEvent("onclick", function() { obj.handleEvent(); }); // Calls instance level handleEvent()
    }
}

For a working example go here. For complete source visit this link and view source. Just click the “Test” link on the example to see the event be handled by the object, which in this case just prints to the debug area.

Software & Technology

Ruby based image crawler

May 18, 2010

1 Comment

I don’t write much code these days and felt it was time to sharpen the saw.

I have a need to download a ton of images from a site (I got permission first…) but it is going to take forever to do by hand. Even though there are tons of tools out there for image crawling I figured this would be a great exercise to brush up on some skills and delve further into a language I am still fairly new to, Ruby. This allows me to use basic language constructs, network IO, and file IO, all while getting all the images I need in a fast manner.

As I have mentioned a few times on this blog, I am still new to Ruby so any advice for how to make this code cleaner is appreciated.

You can download the file here: http://mcdonaldland.info/files/crawler/crawl.rb

Here is the source:

require 'net/http'
require 'uri'

class Crawler

  # This is the domain or domain and path we are going
  # to crawl. This will be the starting point for our
  # efforts but will also be used in conjunction with
  # the allow_leave_site flag to determine whether the
  # page can be crawled or not.
  attr_accessor :domain

  # This flag determines whether the crawler will be
  # allowed to leave the root domain or not.
  attr_accessor :allow_leave_site

  # This is the path where all images will be saved.
  attr_accessor :save_path

  # This is a list of extensions to skip over while
  # crawling through links on the site.
  attr_accessor : omit_extensions # Remove space between : and o - WordPress tries to make this a smiley if I leave them together.

  # This keeps track of all the pages we have visited
  # so we don't visit them more than once.
  attr_accessor :visited_pages

  # This keeps track of all the images we have downloaded
  # so we don't download them more than once.
  attr_accessor :downloaded_images

  def begin_crawl
    # Check to see if the save path ends with a slash. If so, remove it.
    remove_save_path_end_slash

    if domain.nil? || domain.length < 4 || domain[0, 4] != "http"
      @domain = "http://#{domain}"
    end

    crawl(domain)
  end

  private

  def remove_save_path_end_slash        
    sp = save_path[save_path.length - 1, 1]

    if sp == "/" || sp == "\\"
      save_path.chop!
    end
  end

  def initialize
    @domain = ""
    @allow_leave_site = false
    @save_path = ""
    @omit_extensions = []
    @visited_pages = []
    @downloaded_images = []
  end

  def crawl(url = nil)

    # If the URL is empty or nil we can move on.
    return if url.nil? || url.empty?

    # If the allow_leave_site flag is set to false we
    # want to make sure that the URL we are about to
    # crawl is within the domain.
    return if !allow_leave_site && (url.length < domain.length || url[0, domain.length] != domain)

    # Check to see if we have crawled this page already.
    # If so, move on.
    return if visited_pages.include? url

    puts "Fetching page: #{url}"

    # Go get the page and note it so we don't visit it again.
    res = fetch_page(url)
    visited_pages << url

    # If the response is nil then we cannot continue. Move on.
    return if res.nil?

    # Some links will be relative so we need to grab the
    # document root.
    root = parse_page_root(url)

    # Parse the image and anchor tags out of the result.
    images, links = parse_page(res.body)

    # Process the images and links accordingly.
    handle_images(root, images)
    handle_links(root, links)
  end

  def parse_page_root(url)
    end_slash = url.rindex("/")
    if end_slash > 8
      url[0, url.rindex("/")] + "/"
    else
      url + "/"
    end
  end

  def discern_absolute_url(root, url)
    # If we don't have an absolute path already, let's make one.            
    if !root.nil? && url[0,4] != "http"

      # If the URL begins with a slash then it is domain
      # relative so we want to append it to the domain.
      # Otherwise it is document relative so we want to
      # append it to the current directory.
      if url[0, 1] == "/"
        url = domain + url
      else
        url = root + url
      end
    end    

    while !url.index("//").nil?
      url.gsub!("//", "/")
    end

    # Our little exercise will have replaced the two slashes
    # after http: so we want to add them back.
    url.gsub!("http:/", "http://")

    url
  end

  def handle_images(root, images)
    if !images.nil?
      images.each {|i|

        # Make sure all single quotes are replaced with double quotes.
        # Since we aren't rendering javascript we don't really care
        # if this breaks something.
        i.gsub!("'", "\"")        

        # Grab everything between src=" and ".
        src = i.scan(/src=[\"\']([^\"\']+)/i)
        if !src.nil?
          src = src[0]
          if !src.nil?
            src = src[0]
          end
        end

        # If the src is empty move on.
        next if src.nil? || src.empty?

        # We want all URLs we follow to be absolute.
        src = discern_absolute_url(root, src)

        save_image(src)
      }
   end
  end

  def save_image(url)
    # Check to see if we have saved this image already.
    # If so, move on.
    return if downloaded_images.include? url        

    # Save this file name down so that we don't download
    # it again in the future.
    downloaded_images << url

    # Parse the image name out of the url. We'll use that
    # name to save it down.
    file_name = parse_file_name(url)

    while File.exist?(save_path + "/" + file_name)
      file_name = "_" + file_name
    end

    # Get the response and data from the web for this image.
    response = fetch_page(url)

    # If the response is not nil, save the contents down to
    # an image.
    if !response.nil?
      puts "Saving image: #{url}"    

      File.open(save_path + "/" + file_name, "wb+") do |f|
        f << response.body
      end
    end
  end

  def parse_file_name(url)
    # Find the position of the last slash. Everything after
    # it is our file name.
    spos = url.rindex("/")    
    url[spos + 1, url.length - 1]
  end

  def handle_links(root, links)
    if !links.nil?
      links.each {|l|    

        # Make sure all single quotes are replaced with double quotes.
        # Since we aren't rendering javascript we don't really care
        # if this breaks something.
        l.gsub!("'", "\"")

        # Grab everything between href=" and ".
        href = l.scan(/(\href+)="([^"\\]*(\\.[^"\\]*)*)"/i)
        if !href.nil?
          href = href[0]
          if !href.nil?
            href = href[1]
          end
        end

        # We don't want to follow mailto or empty links
        next if href.nil? || href.empty? || (href.length > 6 && href[0,6] == "mailto")

        # We want all URLs we follow to be absolute.
        href = discern_absolute_url(root, href)

        # Down the rabbit hole we go...
        crawl(href)
      }
    end
  end

  def parse_page(html)    
    images = html.scan(/<img [^>]*>/i)
    links = html.scan(/<a [^>]*>/i)

    return [ images, links ]
  end

  def fetch_page(url, limit = 10)
    # Make sure we are supposed to fetch this type of resource.
    return if should_omit_extension(url)

    # You should choose better exception.
    raise ArgumentError, 'HTTP redirect too deep' if limit == 0

    begin
      response = Net::HTTP.get_response(URI.parse(url))
    rescue
      # The URL was not valid - just log it can keep moving
      puts "INVALID URL: #{url}"
    end

    case response
      when Net::HTTPSuccess     then response
      when Net::HTTPRedirection then fetch_page(response['location'], limit - 1)
      else
        # We don't want to throw errors if we get a response
        # we are not expecting so we will just keep going.
        nil
    end
  end

  def should_omit_extension(url)
    # Get the index of the last slash.
    spos = url.rindex("/")

    # Get the index of the last dot.
    dpos = url.rindex(".")

    # If there is no dot in the string this will be nil, so we
    # need to set this to 0 so that the next line will realize
    # that there is no extension and can continue.
    if dpos.nil?
      dpos = 0
    end

    # If the last dot is before the last slash, we don't have
    # an extension and can return.
    return false if spos > dpos

    # Grab the extension.
    ext = url[dpos + 1, url.length - 1]

    # The return value is whether or not the extension we
    # have for this URL is in the omit list or not.
    omit_extensions.include? ext

  end

end

# TODO: Update each comparison to be a hash comparison (possibly in a hash?) in order
# to speed up comparisons. Research to see if this will even make a difference in Ruby.

crawler = Crawler.new
crawler.save_path = "C:\SavePath"
crawler.omit_extensions = [ "doc", "pdf", "xls", "rtf", "docx", "xlsx", "ppt",
                          "pptx", "avi", "wmv", "wma", "mp3", "mp4", "pps", "swf" ]
crawler.domain = "http://www.yoursite.com/"
crawler.allow_leave_site = false
crawler.begin_crawl

# Bugs fixed:
# 1. Added error handling around call to HTTP.get_response in order to handle timeouts and other errors
#
# 2. Added check upon initialization to remove the trailing slash on the save path, if it exists.

Category - Software & Technology

Web based drawing canvas (no HTML5)

JavaScript Event to an Object Instance

Ruby based image crawler

Jason McDonald

Ads don't cover rent...

Recent Posts

Categories