Archive for New features

New ipsojobs summer features

Summer is already here, at least in Barcelona, and today it is better to stay at your workplace with the AC on than out in the street.

For that reason we’ve been working hard on new features :)

Expiration Warnings:

We’ve introduced new warnings for when your job posting is about to expire. To use this feature, simply submit a job posting; you’ll then be asked for your contact email.

We’ll send you a link to renew the job posting (if you want) near the end of the publishing period.

When you renew the job posting, you can optionally enter an email again for a further expiration warning.

Better SEO in Pagination:

The pagination for big cities had become a little bulky, so we’ve simplified it and added a more descriptive title to the links (better than 1, 2, 3 …).

Ultra-fast spam detection for admins:

We’ve set up a new ultra-fast way to remove spammy job postings. We’ll send an email to all the admins explaining the secrets of this new feature.

How to create a simple but powerful CDN with Google App Engine (GAE)

My main purpose when I started looking at Google App Engine (3 days ago) was to use it as a “CDN for the rest of us”: a way to cache static content (initially) and have that content distributed across Google’s whole infrastructure (maybe the most powerful cloud right now).

What do we want?

  • A CDN that is easy to update and free of charge for static resources (images, css, js)
  • To consume as little bandwidth as possible by leveraging the If-Modified-Since/Last-Modified/304 Not Modified model

Hands-on:

The first approach, of course, was to search Google for some help; Andreas Krohn’s post helped a lot to get started.

But I wanted to go further and handle the If-Modified-Since requests that modern browsers send, and that is where the Google framework and a little Python come to the rescue.

Note: I’m assuming you’ve already installed the Python environment and the Google App Engine SDK

First of all let me give you two little .bat files that are useful:

Start the test webserver (test.bat):
dev_appserver.py c:\ipsojobscloud

Upload your application to the cloud (update.bat):
appcfg.py update c:\ipsojobscloud

Note: simply replace c:\ipsojobscloud with the folder you are working in, the one that contains your app.yaml.

Then I set up the app.yaml; it’s very simple (16 lines):

application: ipsojobscloud
version: 1
runtime: python
api_version: 1

handlers:
- url: /favicon.ico
  static_files: favicon.ico
  upload: favicon.ico

- url: /images/favicon.ico
  static_files: favicon.ico
  upload: favicon.ico

- url: /.*
  script: cacheheaders.py

This app.yaml simply tells GAE the name of the application (ipsojobscloud) and the version we’re working on (use only the major release number; GAE automatically takes care of the .x when you upload).

Then we specify two handlers for the favicon.ico static file and a catch-all handler that routes every other request to the Python script cacheheaders.py.

With that environment set up, we simply code the cacheheaders.py file. Let’s see it in detail:

The skeleton of the file is:

import wsgiref.handlers
from google.appengine.ext import webapp

class MainPage(webapp.RequestHandler):

  def get(self, dir, file, extension):
...

def main():
  application = webapp.WSGIApplication([(r'/(.*)/([^.]*).(.*)', MainPage)], debug=False)
  wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
  main()

Here we import the webapp framework and define the class MainPage. In the main section, the only change from the sample GAE application is the regular expression used to match our requests. In r'/(.*)/([^.]*).(.*)' the r prefix marks a Python raw string, so backslashes are not treated as escapes. The first part, /(.*)/, matches a slash, an arbitrary number of characters and another slash; the parentheses tell the regular expression to capture the string between the two slashes as a variable. The next part, ([^.]*)., captures all characters up to (but not including) a dot into the second variable, and finally (.*) captures the rest of the input as the third variable. Strictly speaking, the unescaped . matches any single character; writing \. instead would restrict it to a literal dot.

This regular expression is designed to only capture paths like /images/helloworld.gif where variables are images, helloworld and gif respectively

Note: Of course that’s not a complete solution, since we can only have one folder depth, but improving that is a good exercise for the reader :)

The part you need to know is that when a request arrives, it’s mapped to the get function with the parameters dir, file and extension (and don’t forget the first “self” parameter).

Let’s see the code of the get function in detail:

First, check the validity of the parameters received and set the correct content-type based on the extension:

  def get(self, dir, file, extension):
    if (dir!='js' and dir!='css' and dir!='images'):
      self.error(404)
      return

    if (extension!='js' and extension!='css' and extension!='jpg' and extension!='png' and extension!='gif'):
      self.error(404)
      return

    if extension=='js':
      self.response.headers['Content-Type'] = 'application/x-javascript'
    elif extension=='css':
      self.response.headers['Content-Type'] = 'text/css'
    elif extension=='jpg':
      self.response.headers['Content-Type'] = 'image/jpeg'
    elif extension=='gif':
      self.response.headers['Content-Type'] = 'image/gif'
    elif extension=='png':
      self.response.headers['Content-Type'] = 'image/png'

Note: the first two ifs are completely optional. We check that the dir variable is in our list of valid dirs (js, css, images) and that the file extension is in our allowed list (js, css, jpg, png, gif); change those checks or remove them completely at your convenience.
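
As an alternative to the chain of ifs, the same checks can be driven by a lookup table. This is only a sketch: ALLOWED_DIRS, CONTENT_TYPES and content_type_for are hypothetical names, and the values are the ones listed in the handler.

```python
# Hypothetical table-driven version of the directory and extension checks.
ALLOWED_DIRS = frozenset(['js', 'css', 'images'])
CONTENT_TYPES = {
    'js': 'application/x-javascript',
    'css': 'text/css',
    'jpg': 'image/jpeg',
    'gif': 'image/gif',
    'png': 'image/png',
}


def content_type_for(directory, extension):
    """Return the Content-Type to send, or None if the request should get a 404."""
    if directory not in ALLOWED_DIRS:
        return None
    # Unknown extensions are not in the table, so they also yield None.
    return CONTENT_TYPES.get(extension)
```

Adding a new file type then means adding one entry to the table instead of touching two separate conditions.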

And now the tricky part:

    try:
      import os
      import datetime
      path = dir+'/'+file+"."+extension
      info = os.stat(path)
      # info[8] is st_mtime; use UTC because HTTP dates are expressed in GMT
      lastmod = datetime.datetime.utcfromtimestamp(info[8])
      if self.request.headers.has_key('If-Modified-Since'):
        dt = self.request.headers.get('If-Modified-Since').split(';')[0]
        modsince = datetime.datetime.strptime(dt, "%a, %d %b %Y %H:%M:%S %Z")
        if modsince >= lastmod:
          # The file is older than the cached copy (or exactly the same)
          self.error(304)
          return
        else:
          # The file is newer
          self.output_file(path, lastmod)
      else:
        self.output_file(path, lastmod)
    except:
      self.error(404)
      return

First we import some packages (os, datetime), then create a variable “path” with the full path of the file we want to retrieve:

path = dir+'/'+file+"."+extension

Then we take the file’s info from the operating system and keep the last modified date in the lastmod variable. Note that if an error occurs (a nonexistent file, for example), the except part will be executed, returning a 404 Not Found response to the browser.

In the following lines we scan the request headers looking for an If-Modified-Since header; if we find it, we take the date part:

      if self.request.headers.has_key('If-Modified-Since'):
        dt = self.request.headers.get('If-Modified-Since').split(';')[0]
        modsince = datetime.datetime.strptime(dt, "%a, %d %b %Y %H:%M:%S %Z")

Then we compare the file’s last modification date against the If-Modified-Since date and act accordingly. Note that self.error(304) will return a 304 (Not Modified) response code to the browser:

        if modsince >= lastmod:
          # The file is older than the cached copy or the same
          self.error(304)
          return
        else:
          # The file is newer
          self.output_file(path, lastmod)
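
The decision itself can be tried outside of GAE. The sketch below is an assumption-laden illustration, not the handler’s code: is_not_modified, HTTP_DATE_FORMAT and EPOCH are hypothetical names, the date is parsed with a literal GMT suffix, and an unparseable header falls back to sending the file (the handler above answers 404 in that case).

```python
import datetime

HTTP_DATE_FORMAT = '%a, %d %b %Y %H:%M:%S GMT'
EPOCH = datetime.datetime(1970, 1, 1)


def is_not_modified(if_modified_since, mtime):
    """Return True when a 304 can be sent instead of the file body.

    if_modified_since: raw header value, or None if the header is absent.
    mtime: file modification time in seconds since the epoch (UTC).
    """
    if not if_modified_since:
        return False
    try:
        # Same split as the handler: drop anything after a ";".
        date_part = if_modified_since.split(';')[0].strip()
        modsince = datetime.datetime.strptime(date_part, HTTP_DATE_FORMAT)
    except ValueError:
        # Unparseable date: treat the request as unconditional.
        return False
    # HTTP dates have one-second resolution, so drop fractional seconds.
    lastmod = EPOCH + datetime.timedelta(seconds=int(mtime))
    return modsince >= lastmod
```

A handler would call it with the header value and os.stat(path).st_mtime, sending 304 when it returns True and the file otherwise.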

self.output_file(path, lastmod) is a method we have defined to avoid code duplication:

  def output_file(self, path, lastmod):
    import datetime
    try:
      self.response.headers['Cache-Control'] = 'public, max-age=31536000'
      self.response.headers['Last-Modified'] = lastmod.strftime("%a, %d %b %Y %H:%M:%S GMT")
      expires = lastmod+datetime.timedelta(days=365)
      self.response.headers['Expires'] = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
      # Read in binary mode so images are not corrupted on Windows
      fh = open(path, 'rb')
      self.response.out.write(fh.read())
      fh.close()
      return
    except IOError:
      self.error(404)
      return

As you can see, we import datetime to manipulate dates and try to do the following:

  • Set the Cache-Control header to make the response as cacheable as possible
  • Set the Last-Modified header (IMPORTANT! The first time we send the file, the browser keeps its Last-Modified date; that is the value it will send in subsequent If-Modified-Since requests, to which we will usually respond 304 Not Modified!)
  • Calculate an expiry date (we’ve used 365 days)
  • Set the Expires header with this value (last-modified + 365 days, so it is only in the future for files modified within the last year)
  • Open the file, send it to the output and finally close the file
  • Return, because once we have output the file we’re done

Note: If something goes wrong we return a standard Not Found (404) response
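
The header values from the list above can be checked in isolation with a small sketch. cache_headers, HTTP_DATE_FORMAT and ONE_YEAR_SECONDS are hypothetical names; the Expires rule mirrors the one described in this post (last-modified plus 365 days).

```python
import datetime

HTTP_DATE_FORMAT = '%a, %d %b %Y %H:%M:%S GMT'
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # 31536000, the max-age used in the post


def cache_headers(lastmod):
    """Build the caching headers for a file last modified at `lastmod` (UTC)."""
    # Same rule as the post: Expires is derived from the modification date.
    expires = lastmod + datetime.timedelta(days=365)
    return {
        'Cache-Control': 'public, max-age=%d' % ONE_YEAR_SECONDS,
        'Last-Modified': lastmod.strftime(HTTP_DATE_FORMAT),
        'Expires': expires.strftime(HTTP_DATE_FORMAT),
    }


if __name__ == '__main__':
    for name, value in sorted(cache_headers(datetime.datetime(1994, 10, 29, 19, 43, 31)).items()):
        print('%s: %s' % (name, value))
```

Keeping the header logic in a plain function like this makes it easy to test without starting the development server.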

Conclusions:

We’ve improved the latency of requests for static files by putting them in the cloud, and kept the bandwidth used in the cloud to a minimum by answering If-Modified-Since requests correctly, all in only about 70 lines of code.

One of the advantages of Google App Engine over Amazon S3 is that GAE is free up to 5 million page views a month, which gives us a good chance to try this kind of feature without spending cash.

You can see the speed improvement online on all the ipsojobs.com pages right now!

Some screenshots taken from firebug:

First request:

First request (not cached)

Second request:

Second request, cached, note the 304 responses

Detail of a request:

Sample cached response, details

Full source of cacheheaders.py:

import wsgiref.handlers
from google.appengine.ext import webapp

class MainPage(webapp.RequestHandler):

  def output_file(self, path, lastmod):
    import datetime
    try:
      self.response.headers['Cache-Control'] = 'public, max-age=31536000'
      self.response.headers['Last-Modified'] = lastmod.strftime("%a, %d %b %Y %H:%M:%S GMT")
      expires = lastmod+datetime.timedelta(days=365)
      self.response.headers['Expires'] = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
      # Read in binary mode so images are not corrupted on Windows
      fh = open(path, 'rb')
      self.response.out.write(fh.read())
      fh.close()
      return
    except IOError:
      self.error(404)
      return

  def get(self, dir, file, extension):
    if (dir!='js' and dir!='css' and dir!='images'):
      self.error(404)
      return

    if (extension!='js' and extension!='css' and extension!='jpg' and extension!='png' and extension!='gif'):
      self.error(404)
      return

    if extension=='js':
      self.response.headers['Content-Type'] = 'application/x-javascript'
    elif extension=='css':
      self.response.headers['Content-Type'] = 'text/css'
    elif extension=='jpg':
      self.response.headers['Content-Type'] = 'image/jpeg'
    elif extension=='gif':
      self.response.headers['Content-Type'] = 'image/gif'
    elif extension=='png':
      self.response.headers['Content-Type'] = 'image/png'

    try:
      import os
      import datetime
      path = dir+'/'+file+"."+extension
      info = os.stat(path)
      # info[8] is st_mtime; use UTC because HTTP dates are expressed in GMT
      lastmod = datetime.datetime.utcfromtimestamp(info[8])
      if self.request.headers.has_key('If-Modified-Since'):
        dt = self.request.headers.get('If-Modified-Since').split(';')[0]
        modsince = datetime.datetime.strptime(dt, "%a, %d %b %Y %H:%M:%S %Z")
        if modsince >= lastmod:
          # The file is older than the cached copy (or exactly the same)
          self.error(304)
          return
        else:
          # The file is newer
          self.output_file(path, lastmod)
      else:
        self.output_file(path, lastmod)
    except:
      self.error(404)
      return

def main():
  application = webapp.WSGIApplication([(r'/(.*)/([^.]*).(.*)', MainPage)], debug=False)
  wsgiref.handlers.CGIHandler().run(application)

if __name__ == "__main__":
  main()

Latest news of ipsojobs.com

We’ve been very busy at ipsojobs over the last few weeks.

So busy that we haven’t had time to post :)

Let us share some facts:

  • Traffic is growing steadily, giving our administrators more ad cash every month.
  • Alexa has changed its measuring system, which temporarily took us out of the top 100,000. Now we’re back, and some days we reach the top 50,000 mark, which is great under the new system!
  • We’ve opened Moscow and Sankt-Peterburg in Russia, and the city administrator of those cities has made the Russian translation of ipsojobs.com.
  • We’re starting to integrate broadbean offers into our site, giving us a head start in the UK market. We hope this agreement will start to give results in the coming weeks.
  • ipsojobs.com has reached the 11,000 active jobs mark, which is a big milestone for us!
  • We’ve made some minor SEO adjustments, basically putting the word “Jobs” or “Trabajos” in some important links, between cities and to the worldwide home page.
  • At the technical level we now have two dedicated servers, one for the database and one for the frontend, which gives us enough power to handle a 10-fold growth in traffic.
  • We’ve also added some anti-spam features to keep the quality of all the ipsojobs.com cities as good as always.

More news to come
The ipsojobs.com team

We’ve got a NEW HOME

Hi all,

We’ve improved the ipsojobs.com home page, which now shows the most active cities in a pretty “tag cloud”. Every ipsojobs zone has its own metrics of what is relevant and decides the font size depending on the number of currently active job postings in the city relative to the area.

Here is the first screenshot of the new home:

New Home screenshot

We pursue three objectives:

  1. Promote competition between the city managers in the same area
  2. Save screen space by moving the inactive or less active cities to the bottom
  3. Show the most active cities at a premium size

Go see the new home of ipsojobs.com NOW!

Ipsojobs and Recruit.net integration

Recruit.net page showing Ipsojobs.com job offers

Ipsojobs.com is proud to announce its inclusion in the recruit.net search index.

Recruit.net is the leader in vertical job search in Australia, New Zealand, China, Malaysia, India, Japan and Singapore. This new vertical search engine integration will help Ipsojobs.com become more popular in those areas.

We have been receiving visitors from recruit.net since February 19th (19/02/2008). Welcome, all!

Ipsojobs and SnipTime integration

ScreenShot of a sniptime/ipsojobs integration

Ipsojobs.com is proud to announce its inclusion in the sniptime search index.

Sniptime is one of the most important players in the job search market in Spain. This new vertical search engine integration will help Ipsojobs.com become more popular and more relevant in the Spanish market.

We have been receiving visitors from sniptime.com since February 21st (21/02/2008). Welcome, all!

Ipsojobs and SimplyHired integration

Search result of simplyhired.com showing an ipsojobs.com job offer

Ipsojobs.com is proud to announce its inclusion in the simplyhired search index.

SimplyHired is one of the most important players in the job search market in the US. This new vertical search engine integration will help Ipsojobs.com become more popular and more relevant in the US market.

We have been receiving visitors from simplyhired.com since February 19th (19/02/2008). Welcome, all!

New IPSOJOBS API for developers and geocoders

We’ve just launched a new API for all the developers/geocoders around the world.

As you may know, Ipsojobs.com is a global job bank: fast, easy and free for everybody!

The new API allows you to locate job offers near you. You only need to give us your latitude and longitude, and we will return a nice XML document containing the nearest Ipsojobs cities and the latest job postings in each city (if any).

XML of the new API

You can see a couple of examples:
Jobs near Barcelona
Jobs near Madrid

As you may see in the examples the XML structure is quite simple:

<cities>
<city>

<home>URL of the Ipsojobs city home</home>
<rss> URL of the RSS of that city</rss>
<title>Title of the city, for example “Badalona, Spain”</title>
<current_posts>Number of posts active in that city</current_posts>
<lat>Latitude of the city</lat>
<lng>Longitude of the city</lng>
<posts>
<post>
<title>Title of the job posting</title>
<urlnice>URL of the job posting</urlnice>
</post>

</posts>
</city>

</cities>

To use the API, just point to the URL http://www.ipsojobs.com/api/nearby.php with two parameters, lat and lng, as in this example:

http://www.ipsojobs.com/api/nearby.php?lat=40.416706&lng=-3.703270
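
A client for this API could look roughly like the sketch below, written for Python 3 with only the standard library. build_nearby_url and parse_cities are hypothetical helper names, and SAMPLE is made-up data that only follows the XML structure described above; it is not a real API response.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

API_URL = 'http://www.ipsojobs.com/api/nearby.php'


def build_nearby_url(lat, lng):
    """Return the request URL for the given latitude and longitude."""
    return API_URL + '?' + urlencode([('lat', lat), ('lng', lng)])


def parse_cities(xml_text):
    """Turn a <cities> document into a list of plain dictionaries."""
    cities = []
    for city in ET.fromstring(xml_text).findall('city'):
        posts = [
            {'title': post.findtext('title'), 'url': post.findtext('urlnice')}
            for post in city.findall('posts/post')
        ]
        cities.append({
            'title': city.findtext('title'),
            'home': city.findtext('home'),
            'rss': city.findtext('rss'),
            # A missing or empty count is treated as zero.
            'current_posts': int(city.findtext('current_posts') or 0),
            'lat': float(city.findtext('lat') or 0),
            'lng': float(city.findtext('lng') or 0),
            'posts': posts,
        })
    return cities


# Made-up sample (hypothetical URLs and values) following the documented layout.
SAMPLE = """<cities>
<city>
<home>http://example.com/badalona</home>
<rss>http://example.com/badalona/rss</rss>
<title>Badalona, Spain</title>
<current_posts>2</current_posts>
<lat>41.45</lat>
<lng>2.2474</lng>
<posts>
<post><title>Web developer</title><urlnice>http://example.com/badalona/1</urlnice></post>
<post><title>Waiter</title><urlnice>http://example.com/badalona/2</urlnice></post>
</posts>
</city>
</cities>"""

if __name__ == '__main__':
    print(build_nearby_url('40.416706', '-3.703270'))
    for city in parse_cities(SAMPLE):
        print(city['title'], city['current_posts'], [p['title'] for p in city['posts']])
```

In a real client the XML text would come from fetching the URL built by build_nearby_url, for example with urllib.request.urlopen.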

If you think this API is nonsense, we already have our FIRST IMPLEMENTATION. Check this site:

http://www.esofid.com/

This is a directory of “Language Schools” all around Spain. If you search in the top box for “Barcelona”, for example, you will see all the “Official Language Schools” in that area. Click on one of them and you will see three tabs: the first holds the basic info of the school, while the second contains a Panoramio picture integration and the jobs near that location, obtained with the API we’ve just launched. See the picture.

Esofid, Language Schools Integrated with Ipsojobs API

Nice, huh?

Thanks to Raúl for the idea and the first integration !

Nearby Cities

Nearby cities

We’ve just added an interesting new feature.

In order to increase the exposure of the job postings of a particular city, we have created the ‘Nearby cities’ concept. Just below the ‘categories’ box, there’s a new section called ‘Nearby cities’ which shows the cities nearest to your current city selection. We’ve limited the distance of nearby cities to approximately 70 km.
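
One way such a distance filter could be computed is with the haversine great-circle formula. This is purely an illustrative sketch: the post does not say how ipsojobs measures distance, the function names are hypothetical, and the coordinates in the usage example are approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius
MAX_DISTANCE_KM = 70.0     # the approximate limit mentioned above


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_cities(origin, cities, max_km=MAX_DISTANCE_KM):
    """Return the names of the cities within max_km of origin, nearest first.

    origin is a (lat, lng) pair; cities is a list of (name, lat, lng) tuples.
    """
    with_distance = [
        (haversine_km(origin[0], origin[1], lat, lng), name)
        for name, lat, lng in cities
    ]
    return [name for dist, name in sorted(with_distance) if dist <= max_km]


if __name__ == '__main__':
    barcelona = (41.3851, 2.1734)
    others = [('Madrid', 40.4168, -3.7038), ('Badalona', 41.4500, 2.2474)]
    print(nearby_cities(barcelona, others))
```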

This gives users more options to easily discover good places to work near their city. If it’s good for users, it’s good for us!

New Year and new enhancements!

Happy New Year to everyone!

It seems that the Xmas and New Year break is almost over, so here we go again with renewed energy. As you might have seen, we’ve added two new languages: Dutch and Portuguese. The translation work has been done by Tina and Jaime, ipsojobs administrators. Our most sincere thanks!

We’ve also been tweaking the URLs of single posts to make them SEO-friendly, so the title of the job posting now also appears in the URL.

Finally, we’ve published an internal city ranking to show city administrators their relative performance (see the screenshot below). The goal of each administrator should be to stay in the green zone as long as possible.

Green Zone

It’s Magi time, so if you’d like to see any additional feature on Ipsojobs… just let us know! :-)

© Omatech