Browser request serialisation
placefell
[info]ext2366
It's becoming quite well known that if you want to load a lot of objects from the same web server, they'll download faster if you give the server more than one domain name (but not too many, or you get killed waiting for DNS lookups), because web browsers limit how many items they will download in parallel from the same host.

There are other limits to what the browser will download in parallel, and there's now a really cool tool, Cuzillion, which lets you see how this works in browsers without having to produce the pages yourself, and without anything else affecting the page load time.

I did a bit of experimenting with 10 images taking 2s each to load, and found a weird feature of Firefox (3.5.2). It does load the first 6 images in parallel, but the next 4 are loaded one after another rather than as another parallel block. You can try it yourself here. Thus it takes 10s (2+2+2+2+2) rather than the 4s (2+2) I would expect.
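To make the arithmetic explicit, here is a rough sketch of the two timing models. It assumes every image takes the same time and ignores latency; the class and method names are made up for illustration, and this is not how any browser actually schedules requests.

```java
public class LoadTimeSketch {
    // Expected model: a pool of `limit` connections, so images load in
    // full parallel blocks until none are left.
    static int pooledSeconds(int images, int limit, int secondsEach) {
        int blocks = (images + limit - 1) / limit; // ceiling division
        return blocks * secondsEach;
    }

    // Observed model: one parallel block of up to `limit` images,
    // then every remaining image loads one after another.
    static int observedSeconds(int images, int limit, int secondsEach) {
        int firstBlock = Math.min(images, limit);
        int serial = images - firstBlock;
        return (firstBlock > 0 ? secondsEach : 0) + serial * secondsEach;
    }

    public static void main(String[] args) {
        System.out.println("expected: " + pooledSeconds(10, 6, 2) + "s");
        System.out.println("observed: " + observedSeconds(10, 6, 2) + "s");
    }
}
```

With 10 images, a limit of 6 and 2s per image, the pooled model gives 4s and the observed behaviour gives 10s.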

Attached is a screenshot from Firebug showing this in action. Does this happen in other browsers?

Parallel requests

(no subject)
Some boxed-sets have multiple CDs. Imagine you're creating an API that tells you which CD a track is on. What should it be called?

Poll #1445800 CD numbering
Open to: All, detailed results viewable to: All, participants: 5

What should the API call the element representing which CD a track is on?


discNumber
3 (60.0%)

diskNumber
0 (0.0%)

Other
2 (40.0%)

If Other, what should it be?


Collaborative flash development
Does anyone write Flash apps as part of a team? It seems that the "source" is a big binary mess of an FLA file. This means that you cannot merge changes made by different developers, which basically limits each file to one developer at a time.

There also appears to be no command-line compiler, so neither your automated build system nor your deployment script can compile the Flash. Instead the developer has to commit the SWF file as well as the source, which is something you really shouldn't be doing.

Flex is a lot better in both regards. There's a command-line compiler, and you write .as files like in a real language.

Am I missing something? Is there a command-line compiler?

Cpan fail
I tried to install a Perl module today. OK, my machine is old, but that's why I'm using cpan rather than a package that comes with the OS. First I discovered readline wasn't working in cpan. That's an easy fix, right? You just do install Bundle::CPAN. Nope, it fails.

Recursive dependency detected:
    Bundle::CPAN
 => Test::Harness
 => A/AN/ANDYA/Test-Harness-3.17.tar.gz
 => File::Spec
 => S/SM/SMUELLER/PathTools-3.30.tar.gz
 => Scalar::Util
 => G/GB/GBARR/Scalar-List-Utils-1.21.tar.gz
 => Test::More
 => M/MS/MSCHWERN/Test-Simple-0.92.tar.gz
 => Test::Harness.
Cannot continue.


Now, I'm not dissing people who write Perl for doing testing. However, if you're writing a test harness, please make sure that none of your dependencies need you in order to build. What am I supposed to do?

Persistent urls without page reloads
How many times has someone passed you a URL to Google Maps, and you've clicked on it to find you're at a completely different place from where they intended you to be? Quite often you'll be at the centre of a town, or at their house. This is because they've copied the URL out of the address bar of their browser, thinking that it will take you to what they can see. This is a very sensible way for them to expect it to work; it is how they've been trained to use the web. However, in order to share a link on Google Maps you need to click on the "share this" button, and share the URL that it gives you. When you click the button, it generates a URL that encodes what you're seeing.

The browser URL getting out of sync with what you're seeing is a problem with all AJAX applications: they want to update the content of the page without triggering a page reload and all of the slowdown that entails. Sometimes it isn't just slowdown. On our site, users are playing music, and they don't want it to stop whenever they browse around to find some more to listen to. This problem isn't new to AJAX though; it's been true since the days of frames. One of the reasons that frames didn't catch on was that your browser's URL bar only contained the location of the frameset, which would point to the front page of a site and not to the page that you were currently looking at.

We "solve" this using a complicated system of javascript. Users without javascript can't listen to music using our play/javascript based player anyway, so they have no music to be interrupted.

If you browse around the site, you'll see that your url ends up in the form
  http://www.we7.com/#/music/blues

Can you see the "#" in there? That's an anchor in URL speak, and it tells the browser that the content after it isn't part of the address of the page to request, but a location within that "page" where the content is found.
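If you want to see the split for yourself, a small sketch using the standard java.net.URI class shows which part is the path that gets requested and which part stays in the browser as the anchor. The class and method names are just for illustration.

```java
import java.net.URI;

public class FragmentSketch {
    // The path is what the browser asks the server for.
    static String requestedPath(String url) {
        return URI.create(url).getRawPath();
    }

    // The fragment (anchor) never leaves the browser.
    static String fragment(String url) {
        return URI.create(url).getRawFragment();
    }

    public static void main(String[] args) {
        String url = "http://www.we7.com/#/music/blues";
        System.out.println("requested: " + requestedPath(url));
        System.out.println("anchor:    " + fragment(url));
    }
}
```

For the URL above, the request is for "/" and the anchor is "/music/blues", which is why the javascript has to load the real content itself.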

What about anchors that are used as anchors?

For example, http://www.we7.com/help/using-we7#playing-music

Most of the time this doesn't work. Instead you just get taken to the top of the page. It does work (in some browsers) for links within the site: we intercept links in this form and change them to javascript that loads the new location into the content frame. Once the page has loaded, we scroll the window to the location of that element. (Well, not quite, as that would put the element under the player; we in fact scroll to just above the element, so that it is at the top of the part of the page that you can see.)

But what if a non-javascript user (or googlebot) shares a link with a javascript user? Some sites just add the anchor to the end of the URL that you started on, so if someone came to our site from a Google search for madonna, their URLs would be of the form
  http://www.we7.com/artists/Madonna?artistId=95352#/about/faq

This isn't a very nice URL. It implies to the people you're sharing it with that they'll see something about madonna, even though you're sharing the FAQ page. It also leaks which page you started from (like in the Google Maps case), and it will take non-javascript people (and googlebot, if this is a link on a website) to the unintended madonna page.

Instead, if your browser supports javascript and gets taken to a URL that isn't the root, it will be redirected to the root with a cookie, so that the browser knows what it's supposed to put in the visible part of the page. This front page will then render the player frame and include the page you wanted to see.

This worked quite well, but had a problem: twitter's URL parser truncates URLs at any ? after the #, which means that http://www.we7.com/#/artists/Madonna?artistId=95352 becomes just http://www.we7.com/#/artists/Madonna, which (unfortunately) our site returns a 404 for, as it doesn't have the ID. The simple solution was to change the javascript so that it swaps ? for ! in the URLs shown in the browser, and back again when it makes requests. We also had to make the server redirect URLs containing ! back to the ? form, as people had realised that they could share URLs by just removing the "#/" from the middle.
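The swap itself is simple. Our real code is javascript, so what follows is only a sketch of the idea in Java with made-up names: ? becomes ! in the part after the # for the URL the user sees, and ! becomes ? again when working out what to request.

```java
public class FragmentQuerySketch {
    // URL shown in the address bar: hide any '?' after the '#'
    // so that truncating parsers leave the parameters alone.
    static String toShareable(String url) {
        int hash = url.indexOf('#');
        if (hash < 0) {
            return url; // no anchor, nothing to rewrite
        }
        return url.substring(0, hash) + url.substring(hash).replace('?', '!');
    }

    // Path to request for a shareable URL: take the anchor and
    // put the '?' back. Falls back to "/" when there is no anchor.
    static String toRequestPath(String shareableUrl) {
        int hash = shareableUrl.indexOf('#');
        if (hash < 0 || hash == shareableUrl.length() - 1) {
            return "/";
        }
        return shareableUrl.substring(hash + 1).replace('!', '?');
    }

    public static void main(String[] args) {
        String shown = toShareable("http://www.we7.com/#/artists/Madonna?artistId=95352");
        System.out.println(shown);
        System.out.println(toRequestPath(shown));
    }
}
```

Note that this sketch would also mangle a genuine ! in a path, which is one reason the approach is fragile.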

Bingo, twitter was happy! However, a lot of blogging software which didn't mind the ?s in the anchor doesn't like !. We've considered ~, but Mail.app doesn't like that (WTF?? Hasn't unix had http://www.example.com/~foo/ for years?)

I think the only safe way would be to remap all the URLs in the app so that there are no ?s at all and the parameters are embedded in the path, the way that Rails made popular: http://www.we7.com/artists/95352/Madonna or even http://www.we7.com/artist/Madonna

Any better ways of doing it? Are there good sites that change the URL whilst navigating about using AJAX without this confusing system?

Does this make any sense, or is it just rambling? I should really post more often to get more practice.

Even the jvm gets synchronization wrong sometimes
The JVM has the following code in DeleteOnExit.

  static void add(String file) {
      synchronized(files) {
          if(files == null)
              throw new IllegalStateException("Shutdown in progress");

          files.add(file);
      }
  }


What would you expect to happen if files was null?
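If you'd rather run it than guess, here is a small stand-alone sketch that reproduces the same shape of code. It is not the JDK source: the class name, the field type and the string return values are all made up so that the result can be printed.

```java
import java.util.LinkedHashSet;

public class SynchronizedNullSketch {
    // Simulates the state once shutdown has started: the set is gone.
    private static LinkedHashSet<String> files = null;

    static String add(String file) {
        try {
            // Entering a synchronized block on a null reference fails
            // before the null check inside the block can ever run.
            synchronized (files) {
                if (files == null)
                    throw new IllegalStateException("Shutdown in progress");
                files.add(file);
            }
            return "added";
        } catch (IllegalStateException e) {
            return "IllegalStateException";
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    public static void main(String[] args) {
        System.out.println(add("/tmp/example"));
    }
}
```

The friendly "Shutdown in progress" message is unreachable when files is null, because the synchronized statement itself throws first.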


Share your login?
Here's a quick poll...

Some websites ask you for your login details for other sites. This is in order to do something on your behalf. Maybe it's facebook asking for your gmail account details so they can import your address book as a friends list, maybe it's a site that will email your twitter replies to you, or a website that wants to report your listening to last.fm for you. Would you enter your details onto a third-party website so they can access your account on your behalf?


Poll #1394269 Share your details
Open to: All, detailed results viewable to: All, participants: 6

Would you share your twitter details?


I don't have a twitter account
1 (16.7%)

Sure, they can have my details for something cool
1 (16.7%)

No way are they getting access to my twitter
4 (66.7%)

Isn't that what OAuth is for?
0 (0.0%)

I use the same password for twitter as most sites, so they can probably login as me anyway :)
0 (0.0%)

How about your last.fm login?


I don't have a scrobbler account
0 (0.0%)

Sure, they can use my details to scrobble for me
2 (33.3%)

No way
2 (33.3%)

Don't last.fm have an authentication API?
3 (50.0%)

I use the same password for last.fm as most sites....
0 (0.0%)




Interesting filesystems hack
This is an interesting feature that I learned from our new sysadmin. It seems like it'll be dangerous until you realise which levels of abstraction it works at.

Simply put: you can mount the same filesystem read-write in two places. [1]

This means that when you are migrating a set of files (from /, but that's not important) onto a different partition, you can copy them to the new partition and mount it over where they were (all the open files are still there, just hidden). When you've checked that it all works and want to recover the space taken up by files that you can no longer get to, you can mount the original filesystem somewhere else (where the other mount isn't obscuring it) and delete the files.

It's suggested that prudent people touch a file before they delete, to check that they are about to delete the right set of files.

I thought: eww, that must make a mess, with two sets of structures (inodes etc.) in RAM getting out of sync. But no, it's one filesystem at the bottom, and two sets of paths mapping to that one filesystem. The power of abstractions!


[1] In linux, 2.4.something+

Currently listening to Cake - Comfort Eagle

Graze
Anyone else seen all the marketing for Graze?

They are designed to be perfect for office dwellers who want a healthy option for lunch without having to leave their desk. Clearly it's for those people who don't take a proper lunch break anyway. Now, I fall into that category. Normally I bring in left-overs from home though, so I'm not the sort of person who spends 4 quid/day at M&S on a salad.

They charge you 3 quid/day and post you lunch. What you get is a cardboard box, about the size of a VHS tape, but slightly thinner so it'll fit through letter boxes. In this are 3 plastic containers, containing fresh fruit, dried fruit, seeds and nuts. The exact mix can be tuned using their website by telling them your likes and dislikes.

The idea is that you eat it gradually during the day, rather than all at lunch time, and this is somehow better for your metabolism.

I found that I didn't get into it. It wasn't helped by the first box only arriving at lunch time, but even so I had finished it all by 2:30. I wasn't hungry, which is a good thing. Then I found that the box I'd eaten contained over 600 calories, much more than the 400 you'd get from a carton of soup from tesco.

Anyway, I don't think it's for me, but if anyone wants to try, you can get your first box free, and the second half-price, by using the code: FYWNRC5. Their idea seems a good one, and it might work for you, but 3 quid/day sounds like quite a lot for someone who isn't already buying a sandwich every day. Oh, and you'll have to find another excuse to leave the office at lunch and get your fresh air.

No, really, SimpleDateFormat isn't thread safe.
Maybe the documentation for SimpleDateFormat isn't clear enough. It was certainly a bad design decision, but SimpleDateFormat isn't thread-safe. The parse() and format() methods are not thread-safe. They appear to store parts of the date in fields within the object, so if you parse dates in several threads at a time then you might get errors, and when you format in several threads you might get the wrong dates.

It's not as if nobody has seen this before, yet experienced Java programmers still keep using single SimpleDateFormat objects from multiple threads. It should probably be on the list of things to teach programmers when they start a new job. Maybe I should have a commit hook to check that nobody uses SimpleDateFormat, and have a thread-safe wrapper.
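One possible shape for such a wrapper, as a minimal sketch rather than anything we actually ship: give each thread its own SimpleDateFormat through a ThreadLocal, so no instance is ever shared. The class name SafeDateFormat is made up, and turning the parse failure into an unchecked exception is my own choice here.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public final class SafeDateFormat {
    // One SimpleDateFormat per thread, created lazily on first use.
    private final ThreadLocal<SimpleDateFormat> perThread;

    public SafeDateFormat(String pattern, TimeZone zone) {
        this.perThread = ThreadLocal.withInitial(() -> {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
            format.setTimeZone(zone);
            return format;
        });
    }

    public String format(Date date) {
        return perThread.get().format(date);
    }

    public Date parse(String text) {
        try {
            return perThread.get().parse(text);
        } catch (ParseException e) {
            // Unchecked so callers are not forced to catch it.
            throw new IllegalArgumentException("Unparseable date: " + text, e);
        }
    }
}
```

A single SafeDateFormat can then live in a static field and be used from any thread. On newer Java versions, java.time.format.DateTimeFormatter is immutable and avoids the problem altogether.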

Mapping optional-1-1 relationships in hibernate
We use Hibernate to map our database tables to java objects. There are many problems with hibernate, and I'll probably end up going into them later.

Javascript "Operation Aborted" in IE
Here's a more technical post, to balance out the blatant promotion.

Sometimes you write your flashy javascript to edit the page (maybe you want to embed a widget, or a google map) and it works fine in good browsers. Someone then tries to use IE, and it just sits in a sulk and won't even render the page. It doesn't render an error page; it just pops up a box saying "internal operation error". This is maddening. There's not a lot you can do to debug it.

The problem is that IE does not like it if you add a DOM node that is not a direct child of the document body from an inline script that isn't a direct child of the body element. It's something about the element you want to add it to not existing yet, but surely they could have provided a better error message. If you get this, move your scripts so that they're direct children of the body element, or create the DOM element in a handler that gets called when the DOM has finished loading. See the dom:loaded event in prototype, or the jQuery ready event.


Currently listening to: Welcome to the Jungle - Guns and Roses

Like We7? Dislike the adverts?
(Edit: Public now it's Sunday)
Or do you just like free stuff!

In case anyone doesn't know, We7 allows you to listen to music in your web browser, for free (though it does require javascript and flash). Although it's free, you have to listen to a short (2-5 second) advert before each song you listen to.

We're planning a monthly subscription service where you don't get the audio adverts. Currently Spotify charge you 9.99 for this. As part of this, we're doing a promotion with the Daily Star, starting this Sunday. If you want the music without having to wait until Sunday or buy the Daily Star, then you, my lucky readers, can get a sneak peek at the code before it becomes properly public.

Go to we7.com/go/dailystar and enter the code 52G-WNS-W8H-B7B. Sorry it's not a nice, easy to remember one, but they generated a random one before I suggested just creating a short, memorable, code in the database. If you don't already have a we7 account, you'll need to create one before you can use it.


HttpUnit slow? Are you returning a contentLength of zero?
Our integration tests have been taking a long time for ages. They start up tomcat, and then make a load of web requests to it. When I say a load, it's more like 60 because people have been loath to add new tests as running them is so slow. Each new test takes 3 seconds, which sounds a lot.

One day I'd like to have a comprehensive test suite based on selenium/webdriver. In the meantime, I'm happy that the current rather useless tests don't take so long any more. More developers will run the tests, and will catch their mistakes before they commit rather than once they've started on the next task.

Currently Listening To: Tom Lehrer In Concert

Debian exim rejects valid characters in local parts
If you have %, $, !, ?, |, `, # or & in your email address, it will be rejected at any relay using the default Debian exim config, despite being valid according to RFC2822.

Does anyone use these characters in their addresses? If so, I suspect you're not getting all your mail.

The comments in /etc/exim4/conf.d/acl/30_exim4-config_check_rcpt say:
  # Non-alphanumeric characters other than dots are rarely found in genuine
  # local parts, but are often tried by people looking to circumvent
  # relaying restrictions. Therefore, although they are valid in local
  # parts, these rules disallow certain non-alphanumeric characters, as
  # a precaution.
<snip>
  # These ACL components will block recipient addresses that are valid
  # from an RFC2822 point of view. We chose to have them blocked by
  # default for security reasons.


Another quote, further down says:
  # Single quotes might probably be dangerous as well, but they're
  # allowed by the default regexps to avoid rejecting mails to Ireland.

I guess that is to allow paddy.o'connor@example.ie, though I'm sure there are people other than the Irish who have apostrophes in their names. I'm not sure I'd create a mail system that gave people addresses with apostrophes in them anyway, given the number of sites that won't allow you to use a - or +.

Having been annoyed at being told that my perfectly valid email addresses are not valid, I initially implemented our registration system to accept as much as possible. Unfortunately, this led to it creating an account but not sending you your activation email, because the mail servers wouldn't accept the address. I've now added restrictions to reject such email addresses earlier and provide a slightly better error message to the user: "Invalid character in email" is a lot better than "Internal server error".
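The early check can be very small. This is only a sketch and not our registration code: the class name is made up, and the rejected characters are simply the ones listed at the top of this post rather than anything read out of the exim configuration.

```java
import java.util.Optional;

public class LocalPartCheck {
    // Characters from the list above; an assumption for illustration,
    // not parsed from any mail server's configuration.
    private static final String REJECTED = "%$!?|`#&";

    // Returns an error message for the user, or empty if the address
    // passes this (deliberately narrow) check.
    static Optional<String> validate(String address) {
        int at = address.lastIndexOf('@');
        if (at <= 0) {
            return Optional.of("Missing local part or @ in email");
        }
        String localPart = address.substring(0, at);
        for (char c : localPart.toCharArray()) {
            if (REJECTED.indexOf(c) >= 0) {
                return Optional.of("Invalid character in email");
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(validate("paddy.o'connor@example.ie"));
        System.out.println(validate("some!one@example.com"));
    }
}
```

It only looks at the local part, so it says nothing about whether the domain is sensible or whether the address is otherwise valid.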

Is it really sensible to ship a default configuration that blocks a load of valid email addresses? It says that if you want your site to be more permissive you can just edit the configuration, but that won't help your email be delivered, as it requires all of the mail servers it passes through to accept your address.

Update: This is now debian bug 522807. Let's see how long before it gets marked wontfix.

We7 Scrobbler
Woo! If, like me, you listen to music on We7, but want Last.fm to keep a record of what you listen to, you can now do this. Maybe you want to share it on your home page, maybe you want LJ to display what you're currently listening to.

You can use We7 Scrobbler. It requires you to be using Firefox, as it's a Greasemonkey script. It works entirely client-side and doesn't pass your details to We7; they go straight to Last.fm from javascript in your browser.

This isn't an officially supported feature and it may break. It shouldn't cause your computer to explode when you run it, but even that isn't guaranteed. If you experience problems with the site after running it, it's probably best to disable Greasemonkey and try again.

For those people with whom I've discussed writing such a script, this isn't written by me, but I am using it at the moment.

Finding useless indexes in Postgres
It's always hard to work out when you need to create an index for your database. My general rule is to create an index if there's any doubt, paying particular attention to columns that reference other items as a foreign key (as postgres at least has to find all the rows that reference a row that you change, to check that they're still valid), and to anything that people will look items up by.

However, some tables get a lot of writes (creates and updates) and few reads. These are quite often written to in time-critical sections of your code. Such tables shouldn't have too many indexes, as every write carries the overhead of updating each index.

There's a very interesting bit of SQL to find unused indexes in postgres which will tell you which of these are a waste of time. Indexes that take a gig of space, have been written 10 million times, and have never been read are probably not needed and can be removed, until the point where your queries change and you need them again.
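I don't have the linked query to hand, so here is a sketch of the general idea instead, wrapped in a Java constant so it can be pasted into a JDBC call or printed and run in psql. It assumes the standard pg_stat_user_indexes statistics view; the class and constant names are made up.

```java
public class UnusedIndexQuery {
    // Lists indexes that have never been scanned since the statistics
    // were last reset, largest first. Not the query the post links to.
    static final String SQL =
        "SELECT schemaname, relname, indexrelname, idx_scan,\n" +
        "       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size\n" +
        "FROM pg_stat_user_indexes\n" +
        "WHERE idx_scan = 0\n" +
        "ORDER BY pg_relation_size(indexrelid) DESC";

    public static void main(String[] args) {
        System.out.println(SQL);
    }
}
```

Be careful with the results: primary key and unique indexes can show zero scans and still be needed to enforce constraints, and the counters only cover the period since the statistics were last reset.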

Quartz cron trigger specs are not like cron
For some parts of our app, we use Quartz CronTrigger objects.

To specify when a task will run, you give it a string. This string looks a bit like a crontab(5) entry. You think "That sounds like a good idea, it means that I don't need to learn a new way to specify these things." Now, look again at the CronTrigger documentation: it says "defined with Unix 'cron-like' definitions", and the important bit is "cron-like". CronTrigger has an extra field at the beginning for seconds. This sounds good. Imagine I wanted to run something at 11am on Sunday. In cron I would do
0 11 * * 7
and expect the equivalent in CronTrigger to be
0 0 11 * * 7
which makes sense. Unfortunately this would actually run on Saturday, because CronTrigger numbers the days of the week from 1 (Sunday) to 7 (Saturday).

If it wasn't a format that looked like cron then people would be more likely to read the documentation for the spec, rather than just assume that it was the same. Please, if you're developing something don't make it "close to" another thing, either make it the same or different. Now, I don't mind adding extra features, but if something looks like a cron spec, please make it behave like one.
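To show the difference, here is a rough converter from a simple five-field crontab entry to a CronTrigger-style string. It is a sketch with made-up names that does not use Quartz itself, it only handles a plain number or * in the day-of-week field, and it assumes (as I read the Quartz documentation) that days run from 1 for Sunday to 7 for Saturday and that one of the two day fields should be ?.

```java
public class CronToQuartzSketch {
    // crontab uses 0-7 with both 0 and 7 meaning Sunday;
    // the Quartz-style numbering assumed here is 1 (Sunday) to 7 (Saturday).
    static int toQuartzDayOfWeek(int cronDay) {
        if (cronDay < 0 || cronDay > 7) {
            throw new IllegalArgumentException("day of week out of range: " + cronDay);
        }
        return (cronDay % 7) + 1;
    }

    // Converts "minute hour day-of-month month day-of-week" into
    // "seconds minute hour day-of-month month day-of-week".
    static String toQuartz(String cronEntry) {
        String[] fields = cronEntry.trim().split("\\s+");
        if (fields.length != 5) {
            throw new IllegalArgumentException("expected 5 fields: " + cronEntry);
        }
        String dayOfMonth = fields[2];
        String dayOfWeek = fields[4];
        if (dayOfWeek.equals("*")) {
            dayOfWeek = "?"; // no specific weekday requested
        } else {
            dayOfWeek = String.valueOf(toQuartzDayOfWeek(Integer.parseInt(dayOfWeek)));
            if (dayOfMonth.equals("*")) {
                dayOfMonth = "?"; // the weekday decides, not the date
            }
        }
        return "0 " + fields[0] + " " + fields[1] + " " + dayOfMonth
                + " " + fields[3] + " " + dayOfWeek;
    }

    public static void main(String[] args) {
        System.out.println(toQuartz("0 11 * * 7"));
    }
}
```

For the 11am Sunday example this produces 0 0 11 ? * 1. Using the names (SUN, MON and so on) in the spec would avoid the numbering trap entirely.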

Java regular expressions and multi-lines
When using Perl to match strings, you would quite often do something like:
  if ($message =~ /value/) { print $message }

In Java, regular expressions used via String.matches have to match the whole string, so you would do
  if (message.matches(".*value.*")) { System.out.println(message); }

Now, what happens if you do
  if ("foo\nsome value 5".matches(".*value.*")) { System.out.println("It matched!"); }
?
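Here is a small sketch you can run to check the answer and compare two alternatives. The class and method names are just for illustration; the calls are all from the standard java.util.regex package.

```java
import java.util.regex.Pattern;

public class MultilineMatchSketch {
    static final String MESSAGE = "foo\nsome value 5";

    // By default "." does not match a line terminator, so ".*" cannot
    // cross the "\n" and the whole-string match fails.
    static boolean wholeStringMatch() {
        return MESSAGE.matches(".*value.*");
    }

    // The (?s) flag (DOTALL) lets "." match line terminators too.
    static boolean dotAllMatch() {
        return MESSAGE.matches("(?s).*value.*");
    }

    // find() looks for the pattern anywhere, like Perl's =~ does.
    static boolean findMatch() {
        return Pattern.compile("value").matcher(MESSAGE).find();
    }

    public static void main(String[] args) {
        System.out.println("matches:        " + wholeStringMatch());
        System.out.println("matches + (?s): " + dotAllMatch());
        System.out.println("find:           " + findMatch());
    }
}
```

The plain matches call returns false, while the DOTALL version and find both return true. For a direct translation of the Perl, find is the closest fit.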

Java quiz (Generics)
Consider the following program. It performs magic, by converting an Integer into a String. Obviously this doesn't work, but what actually happens?
   1. class gen<T> {
   2.     Object obj;
   3.     public gen(Object s) {
   4.         obj = s;
   5.     }
   6.     T test() {
   7.         return (T) obj;
   8.     }
   9. }
  10.
  11. public class test {
  12.     public static void main(String args[]) {
  13.         gen<String> z = new gen<String>(new Integer(20));
  14.         System.out.println(z.test());
  15.     }
  16. }

Poll #1366019 Exceptional!
Open to: All, detailed results viewable to: All, participants: 1

Which line is the exception thrown on?

View Answers

7
0 (0.0%)

13
0 (0.0%)

14
0 (0.0%)

Other
1 (100.0%)

None - no exception
0 (0.0%)
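Spoiler warning: if you want to check your answer by running something, this sketch shows the same effect with made-up names (Box and whereItFails are not from the quiz). It relies on generics being erased at compile time, so the cast inside the generic class is unchecked and the real check is the one the compiler inserts where the value is used as a String.

```java
public class ErasureSketch {
    static class Box<T> {
        private final Object value;

        Box(Object value) {
            this.value = value;
        }

        @SuppressWarnings("unchecked")
        T get() {
            // Unchecked cast: after erasure this is just a return of Object.
            return (T) value;
        }
    }

    static String whereItFails() {
        Box<String> box = new Box<String>(Integer.valueOf(20));
        Object asObject = box.get(); // fine: no String is needed here
        try {
            String asString = box.get(); // compiler-inserted cast to String
            return "no exception: " + asString;
        } catch (ClassCastException e) {
            return "ClassCastException at the call site, value was " + asObject;
        }
    }

    public static void main(String[] args) {
        System.out.println(whereItFails());
    }
}
```

The exception comes from the line that uses the result as a String, not from the line containing the (T) cast.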

