
Codility - Is it reasonable?

Quote:
Original post by Extrarius
Machine-scored coding tests might prevent the company from wasting time on bad candidates, but they'll also prevent the company from getting the best of the best. Judging efficiency without listing it as a grading point sounds like a bad idea to me - you'll end up paying more per unit of work due to premature optimization, associated debugging, etc.

IME, it is far better to use open-ended pre-screen questions that are judged by humans. For example, "Describe what happens when you open a URL in your browser." This works very, very well for my employer.


You're overreaching. Look at the free sample. Algorithmic correctness, protection against overflow, and appropriate big-O design are not exactly stretches for a problem this small. These are not heavy changes that you need to fix afterward for "optimization". If you don't do these things naturally in your coding, frankly, you're not qualified. If, for example, you're so into saving memory, why not just use a char or short? Why pick int? Use a long long and avoid most overflow issues.
Amateurs practice until they do it right. Professionals practice until they never do it wrong.
Quote:
Original post by TheBuzzSaw
Quote:
Original post by Extrarius
Machine-scored coding tests might prevent the company from wasting time on bad candidates, but they'll also prevent the company from getting the best of the best. Judging efficiency without listing it as a grading point sounds like a bad idea to me - you'll end up paying more per unit of work due to premature optimization, associated debugging, etc.

IME, it is far better to use open-ended pre-screen questions that are judged by humans. For example, "Describe what happens when you open a URL in your browser." This works very, very well for my employer.


You're overreaching. Look at the free sample. Algorithmic correctness, protection against overflow, and appropriate big-O design are not exactly stretches for a problem this small. These are not heavy changes that you need to fix afterward for "optimization". If you don't do these things naturally in your coding, frankly, you're not qualified. If, for example, you're so into saving memory, why not just use a char or short? Why pick int? Use a long long and avoid most overflow issues.


long long is not (yet) in the C++ standard, and even if it were, how would that help for 1000000-bit numbers?
Well, in that sample problem you need an integer type that can hold INT_MAX * the maximum size of the vector (because the vector itself holds ints, which cannot be arbitrarily big). I don't think long long is guaranteed to be that big, but in practice it should be, even in an automated test.

This just goes to show, though, that the Codility test suffers from under-specification, and this type of number overflow is exactly the type of thing that might slip you up on the first run through. I'd claim that even a really good programmer/debugger could slip up at that point (in fact, depending on how long long is defined, which I don't know precisely, it's quite possible that you cannot implement a perfect solution in C++ without rolling your own ad-hoc big number class, which would be just silly for that kind of test), but you would immediately know what to fix when you get a problem in the test results.
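To be concrete, the usual defence in C++ is simply to accumulate into a type wider than the element type. A minimal sketch, assuming long long is available as a 64-bit extension (it only becomes standard in C++0x):

#include <vector>

// Sum a vector of ints without overflowing the element type: accumulate
// into a 64-bit integer. Even if every element is INT_MAX (~2^31), a
// 64-bit sum cannot overflow until the vector holds on the order of
// 2^32 elements, far beyond any realistic test input.
long long sum(const std::vector<int>& v)
{
    long long total = 0;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i)
        total += v[i]; // each int is widened to long long before the add
    return total;
}

Forgetting that widening on the first pass is exactly the kind of slip described above.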

So you risk losing pretty good candidates based on random chance.

That's not to say that prescreening tests to weed out the obvious duds is a bad idea, nor that there is a prescreening test that is perfect. It's just something to keep in mind.

Edit: I do agree that not going for the O(n) solution immediately would make me skeptical of the skills of a programmer. I'm more worried about the fact that this kind of test doesn't allow a programmer to demonstrate her ability to recover from silly mistakes or from overlooking some corner case on the first run through - I value this ability pretty highly, because mistakes do happen all the time.
Widelands - laid back, free software strategy
Quote:
Original post by mkuz
The test must present verification results for the same test cases that will be given in the report, so the user can improve his code before submission.


That it doesn't do this is, for me, the real issue here.

I can see some use in not initially specifying performance constraints to see what the programmer's natural inclination is. But I can see no use at all in preventing the programmer from improving his program before submission by withholding the full feedback from him (assuming I'm interpreting mkuz correctly; I haven't used Codility myself).

It seems like very useful information to know, for instance, that someone did not initially write optimal code but, upon realizing that his program was too slow, was able to improve its performance to an acceptable level. That's much more useful than just knowing that he didn't initially write optimal code.
Quote:
Original post by nilkn
Quote:
Original post by mkuz
The test must present verification results for the same test cases that will be given in the report, so the user can improve his code before submission.


That it doesn't do this is, for me, the real issue here.

I can see some use in not initially specifying performance constraints to see what the programmer's natural inclination is. But I can see no use at all in preventing the programmer from improving his program before submission by withholding the full feedback from him (assuming I'm interpreting mkuz correctly; I haven't used Codility myself).

It tells you if you passed the tests, your score, your estimated big O, and which tests you passed/failed (with rough descriptions of the tests).
Quote:
It seems like very useful information to know, for instance, that someone did not initially write optimal code but, upon realizing that his program was too slow, was able to improve its performance to an acceptable level. That's much more useful than just knowing that he didn't initially write optimal code.

Why? It's an interview question. You should do the absolute BEST you can on it. Show your skills. He didn't, so he failed.

Furthermore, does your end user tell you EXACTLY how they're going to use your software? No. That's part of the point of tests like these. You have the constraints of its usage and how it should behave. The performance question is NOT something an end user can tell you. They can say "I want to click button X and have it do Y," but they can't say "I want to click button X and have it do Y within O(n), where n is the number of items I've selected."

Typically, by the time an interview candidate has gotten to the point where I'm talking to them, they've passed some form of a test much like this, either in house or from similar sites. Oftentimes we'll require a code sample that demonstrates their effective knowledge, but that's usually collected at some point after pre-screening such as Codility.

In time the project grows, the ignorance of its devs it shows, with many a convoluted function, it plunges into deep compunction, the price of failure is high, Washu's mirth is nigh.

The size of the inputs is an important part of algorithm selection. Thus, input sizes should always be given.

I know three ways to implement a Fibonacci calculator: recursively (O(φ^n) time), iteratively (O(n) time), and iteratively with repeated squaring of the matrix [0,1;1,1] (O(lg n) time).

The problem is that the O notation hides constant factors. The O(lg n) algorithm is slower than the O(n) algorithm for small input values.
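To make the comparison concrete, here is a rough sketch of the iterative method and the repeated-squaring method (my own illustration, assuming 64-bit unsigned arithmetic, so both overflow somewhere past fib(93)):

#include <cstdio>

typedef unsigned long long u64; // assumes a 64-bit type

// O(n): simple iteration.
u64 fib_iter(unsigned n)
{
    u64 a = 0, b = 1; // fib(0), fib(1)
    for (unsigned i = 0; i < n; ++i) {
        u64 next = a + b;
        a = b;
        b = next;
    }
    return a;
}

// O(lg n): repeated squaring of the matrix [0,1;1,1].
// [0,1;1,1]^n = [fib(n-1), fib(n); fib(n), fib(n+1)]
struct Mat { u64 a, b, c, d; }; // row-major 2x2

static Mat mul(Mat x, Mat y)
{
    Mat r;
    r.a = x.a * y.a + x.b * y.c;
    r.b = x.a * y.b + x.b * y.d;
    r.c = x.c * y.a + x.d * y.c;
    r.d = x.c * y.b + x.d * y.d;
    return r;
}

u64 fib_matrix(unsigned n)
{
    Mat result = { 1, 0, 0, 1 }; // identity
    Mat base   = { 0, 1, 1, 1 }; // [0,1;1,1]
    while (n > 0) {              // standard square-and-multiply
        if (n & 1) result = mul(result, base);
        base = mul(base, base);
        n >>= 1;
    }
    return result.b;             // fib(n)
}

int main()
{
    printf("%llu %llu\n", fib_iter(40), fib_matrix(40)); // both print 102334155
    return 0;
}

For small n the matrix version does more work per step (eight multiplications per 2x2 product), which is exactly the constant-factor effect I mean.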

For a more extreme example, take the all-pairs-shortest-path problem. I can solve it in O(n^3) or O(n^2.37*lg(n)) time. The second algorithm is little-o of the first and thus faster by an unbounded factor for sufficiently large n. However, the constant factor is so large that it is useless. Using a modification of Strassen we could find middle ground at O(n^2.80*lg(n)). But which algorithm we select depends entirely upon the size of the input.

Thus, IMO, it is unreasonable (in most cases) to ask for an algorithm without giving the size of the inputs the problem will be run on.

Thank you for your time,
Arrummzen
I just tried the demo and was given a problem to find the equilibrium index of a sequence of numbers. The list of numbers given in the example is quite short (6 items), but I would implement this as a binary search starting in the middle. There are a few glaring issues, however, that I can see:

1) How many numbers am I expected to work with? The example gives 6, but it might be in the millions.
2) What are the performance requirements? Should I implement the most basic and reliable method (serial search), or is it performance critical?
3) I have only half an hour to do this in. This isn't enough time to write tests, documentation, or any of the other things that one does in the real world.
4) There is no human to ask questions of. This is the biggest fault. I have no idea what the arbitrary requirements are, and nobody to ask about them.

So all in all, I can say that this is worse than an interview because there is nobody to ask questions of. Sadly, a lot of PR people and a lot of managers are too inept at their jobs to realise this; so a lot of decent programmers are likely to fail this test.

At a real interview I would try to impress my potential employer with my knowledge of all the possible implications, and with a variety of solutions.
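For reference, the textbook answer to the equilibrium-index task is a single linear scan over a running prefix sum. A rough sketch, assuming the usual problem statement (index i is an equilibrium if the sum of the elements before it equals the sum of the elements after it) and using 64-bit sums to sidestep the overflow issue discussed earlier in the thread:

#include <vector>

// Return the first equilibrium index of v, or -1 if none exists.
// One O(n) pass with 64-bit running sums, so the totals cannot
// overflow for int elements.
int equilibrium_index(const std::vector<int>& v)
{
    long long total = 0;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i)
        total += v[i];

    long long left = 0;
    for (std::vector<int>::size_type i = 0; i < v.size(); ++i) {
        long long right = total - left - v[i];
        if (left == right)
            return static_cast<int>(i);
        left += v[i];
    }
    return -1;
}

Even something this small has judgement calls in it (first index or all of them? what about an empty array?), which is exactly why I'd want a human to ask.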

Some of the early responses to the OP revolved around "Life is not fair, job interviews are not fair, so it's OK for this not to be fair either". By that logic, it's OK for me to cheat in the test because other people cheat. In fact, aside from lucking out, I can't see how anybody could pass this test without cheating.

[edit]

I wish I had completed the test and then come here to post about it... I was going to look at a variety of tests and post back with a more considered opinion, but it won't let me do that. You get one demo only. Obviously this is so that people don't repeatedly use the demo to test people.

P.S. Upon reading the assignment again, I found that it said "assume the array may be very large." This gives me some indication of the performance implications, but I would still ask a human interviewer for some estimates as to the range of sizes.



Don't thank me, thank the moon's gravitation pull! Post in My Journal and help me to not procrastinate!
Quote:
Original post by Prefect

This just goes to show, though, that the Codility test suffers from under-specification, and this type of number overflow is exactly the type of thing that might slip you up on the first run through.
I agree, especially since agile proponents evangelize sloppy programming that is then incrementally fixed. But someone who is capable of writing provably correct code will be able (even if unwilling) to transition to different practices. The reverse isn't true.

Quote:
I'd claim that even a really good programmer/debugger could slip up at that point (in fact, depending on how long long is defined, which I don't know precisely, it's quite possible that you cannot implement a perfect solution in C++ without rolling your own ad-hoc big number class, which would be just silly for that kind of test), but you would immediately know what to fix when you get a problem in the test results.

Testing can only prove the presence of bugs; it cannot prove their absence.

Especially when coding in C or C++, knowing the ranges of your integer types is very important.

Quote:
So you risk losing pretty good candidates based on random chance.

The employer gets all the test results. They can choose whether numeric overflow matters to them. Many are likely to decide that only perfect scores matter.

But the undertone of this statement is that programmers who pass with 100% will somehow be generally inferior to those who scored just slightly less.

These are coding positions. Employers do not care about long-term viability; they need code written. Very few employers offer long-term career development, and a coder's tenure will likely be between 1 and 2 years. So they need the best programmers they can get, and such a test definitely beats the usual IT interviews.

In the end, it's up to the employer. They see all the results, all the ratings and source code.

And while it may sound unfair, I have worked with programmers who write straight perfect code. No rewriting, no trial and error, no "it might work". And none of them was some nerdy shut-in. If anything, for them programming was more of a hobby while they moved upwards into fields like finance. Occasionally I felt really frustrated at how good they were and how easy it was for them.

But as a job seeker one must always remember that programming has zero barriers to entry. So until faced with such people it is indeed easy to imagine that just hacking something together is the best possible way.

Every year up to a million people graduate in CS. If only 1% of them are that good, that is a lot of competition. Just something for job seekers to consider. I mean *really* consider. And it will get much worse over the next 10 years.

Still, would you rather fail a programming test, where you can honestly improve even on your own, or silently fail a Myers-Briggs or MMPI-2 evaluation due to some rating mismatch? Or be rejected due to an imperfect credit rating?
Quote:
Original post by Antheus
Still, would you rather fail a programming test, where you can honestly improve even on your own, or silently fail a Myers-Briggs or MMPI-2 evaluation due to some rating mismatch? Or be rejected due to an imperfect credit rating?


I'd definitely take failing the programming test. About a year ago, I finally got an interview with pretty much my dream company. I ended up taking a programming test like the one described here. I assume I passed it, as I got another interview afterwards, and I assume I passed the next few. Eventually I found out the company got down to 12 candidates, and I ended up failing my last interview, thus not getting the job. The interview I failed wasn't a programming interview. It was just a series of questions about what games I enjoyed, what games I didn't, and other questions about my hobbies. At that time in my life, I was in a job that I hated, so just the fact that this opportunity came around was indescribable. Trust me, I would have rather failed knowing that I wasn't good enough technically.

As for the test, I can't say these tests are perfect, but regardless of that, you probably should have prepared for it. Yes, you can't know exactly what questions they're going to ask, but enough resources exist from people who have previously taken it to give you advice on things like performance, large numbers, etc.

As others have mentioned, there is a wide range of skills, and probably plenty of people who passed the test. So while the test may not be perfect, as it is possible through bad luck to eliminate a few excellent candidates, it's good enough to filter down to decently qualified candidates. Besides, that's why follow-up interviews exist: to filter down even further.

Companies still might not always make the right hiring decisions, even with these tests and interviews. Failures exist at all levels. But a good developer will still find a way of getting a job regardless of a failed interview or two.
Quote:
Original post by Obscure
Quote:
Original post by mkuz
In the description, Codility did not mention that they score not only correctness but also program performance.

Of course. Employers are looking for staff who will do their best work all the time, not just when they are explicitly asked. You're interviewing/testing for a job; it should be obvious that you need to produce the most efficient solution, not just the minimum work necessary to solve the problem.
The most efficient solution is in most cases not the most desirable one; readability nearly always trumps performance. Unless it's obvious from the job description that you're going to be writing a lot of performance-critical code, even an overly naive but perfectly clear solution is better than a complicated one.

It isn't about someone not doing their best, but about the value of "best" being undefined.

