Category: Programming

Image Recognition Tool

Having recently got quite into machine learning tools, I’ve written a small GUI-based tool that uses the PyBrain library to ‘learn’ a common theme from a folder of training data. To test new images, drag and drop them onto the tool to get a percentage similarity to the training data.

The tool is written using Python, PyBrain, Glade2 and PyGTK, so you need those libraries available.

The tool can be downloaded here – image_recognition_tool

Here is a presentation I made for work as an introduction to machine learning. Notes are included with the PowerPoint slides – Machine Learning

Image Recognition with PyBrain

After recently completing the Machine Learning course from Stanford University on Coursera, I’ve been preparing to give a small introduction to machine learning at work. Part of that is showing some demos of machine learning tools.

I made a character recognition neural network using the PyBrain Python library. It’s a great library and very fast, but the documentation is very poor and examples are hard to come across. With enough digging I managed to put together something very simple and short.

This example reads in small PNG files of letters, extracts all of the pixel values and flattens them into a 1D array, which is used to train the neural network through backpropagation. I then test the network on one of the inputs. Each input is classified with a number in the addSample function, which takes the flattened array and a number (unfortunately it does not take a string as a classification). If you run the application you will see that, for example, when using b.png as a test it returns a value close to 2.
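The data preparation step can be sketched without any libraries. This is only an illustration of the flattening and numeric labelling described above, not the original script: the LABELS mapping and the tiny pixel grids are made up, and the only thing taken from the post is that b.png corresponds to a value near 2. Each resulting (input, target) pair is the kind of thing that would be handed to addSample.

```python
# Hypothetical class numbers; the post only says that b.png maps to roughly 2.
LABELS = {"a": 1, "b": 2, "c": 3}


def flatten(pixels):
    """Turn a 2D grid of pixel values into a 1D list, row by row."""
    return [value for row in pixels for value in row]


def build_samples(images):
    """Pair each flattened image with its numeric class.

    images maps a letter to a 2D grid of pixel values. The result is a
    list of (flat_input, [target]) pairs, ordered by letter.
    """
    return [
        (flatten(grid), [LABELS[letter]])
        for letter, grid in sorted(images.items())
    ]


if __name__ == "__main__":
    # Stand-in 2x2 "images"; real input would come from the PNG files.
    demo = {"a": [[0, 255], [255, 0]], "b": [[255, 255], [0, 0]]}
    for flat_input, target in build_samples(demo):
        print(flat_input, target)
```

Because the target has to be a number, the letter-to-number mapping needs to be kept around to turn the network's output back into a character.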

You can download the images I used here – Machine Learning Training Characters.


Caesar Cipher Cracker

Another programming challenge from work: solve Caesar-ciphered sentences and return the correct shift value. Solving the cipher automatically is easy enough; the hard part is automatically detecting whether the resulting shifted sentence is English. I have posted before about detecting English using n-grams and used a similar process here.

Here is the code which contains the four test cases.
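As a rough, independent sketch of the approach (not the original listing and not its four test cases), the following tries all 26 shifts and keeps the one whose output contains the most common English bigrams. The COMMON_BIGRAMS set is a small hand-picked assumption standing in for a proper n-gram table, so it will be less reliable on very short sentences.

```python
# A small, hand-picked set of frequent English bigrams (an assumption,
# standing in for a real n-gram frequency table).
COMMON_BIGRAMS = {
    "th", "he", "in", "er", "an", "re", "on", "at", "en", "nd",
    "ti", "es", "or", "te", "of", "ed", "is", "it", "al", "ar",
    "st", "to", "nt", "ng", "se", "ha", "as", "ou", "io", "le",
}


def shift_text(text, shift):
    """Move every ASCII letter back by `shift` places, leaving the rest alone."""
    result = []
    for ch in text:
        if "a" <= ch <= "z":
            base = ord("a")
        elif "A" <= ch <= "Z":
            base = ord("A")
        else:
            result.append(ch)
            continue
        result.append(chr((ord(ch) - base - shift) % 26 + base))
    return "".join(result)


def english_score(text):
    """Count how many adjacent character pairs are common English bigrams."""
    lowered = text.lower()
    return sum(
        1 for i in range(len(lowered) - 1) if lowered[i:i + 2] in COMMON_BIGRAMS
    )


def crack(ciphertext):
    """Return (shift, plaintext) for the most English-looking candidate."""
    best_shift = max(
        range(26), key=lambda s: english_score(shift_text(ciphertext, s))
    )
    return best_shift, shift_text(ciphertext, best_shift)


if __name__ == "__main__":
    secret = shift_text("the weather in the north is rather nice this evening", -3)
    print(crack(secret))
```

Calling shift_text with a negative shift encrypts, which is how the demo builds its ciphertext.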

C# DateTimePicker in PropertyGrid

Property grids are great for quickly giving users an interface for interacting with objects in WPF, but customising them is not obvious. Luckily it is possible by extending the UITypeEditor class to launch custom GUI controls.

In this example I wanted to customise the default DateTime handler of PropertyGrid to include hours, minutes and seconds, as unfortunately it only shows the date by default. The following class shows how to extend UITypeEditor in order to launch a pop-up DateTimePicker when the user selects the item.

Then in order to add that to your property grid, add the following to the DateTime item in the class used by the PropertyGrid –

The key part here is the EditorAttribute.

Update

Here’s a great article explaining custom UITypeEditors in more detail – http://bobpowell.net/TypeEditors.aspx

Reblog: Python Minidom and Whitespace

A good tutorial providing fixes for printing of XML using Python and Minidom. Specifically fixing added whitespace and the conversion of characters into HTML entities.

The Python subprocess module is a highly flexible way of running background processes in Python. Here are two functions that show how to run external shell commands: one waits for the command to return, and the other runs it in the background, allowing the controlling script to stop it as required.

Passing unsanitized input to these functions will not end well.
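A minimal sketch of that pair of functions might look like the following. The function names are my own, and it takes the command as an argument list rather than a shell string, which avoids the shell entirely and so sidesteps some of the injection risk.

```python
import subprocess


def run_and_wait(args, timeout=None):
    """Run a command, block until it exits, and return (code, stdout, stderr)."""
    completed = subprocess.run(
        args, capture_output=True, text=True, timeout=timeout
    )
    return completed.returncode, completed.stdout, completed.stderr


def run_in_background(args):
    """Start a command without waiting for it and return the Popen handle."""
    return subprocess.Popen(
        args, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )


def stop(process, grace_seconds=5.0):
    """Ask a background process to terminate, killing it if it ignores us."""
    process.terminate()
    try:
        process.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()
    return process.returncode
```

The handle returned by run_in_background can be polled with poll() to see whether the command is still running before calling stop().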


Generic CSV to Plot With Numpy and Python

Given multiple CSV files containing a timestamp column and some plottable values (int or float), this script plots multiple columns of data against time.

Usage – ./CSVPlot.py first.csv second.csv third.csv
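The general shape of such a script can be sketched as below. This is not CSVPlot.py itself: it assumes the time column is literally named "timestamp" and holds plain numbers, and it uses the csv module plus matplotlib rather than whatever the original used. Columns that do not parse as numbers are skipped.

```python
import csv
import sys


def load_series(path, time_column="timestamp"):
    """Read one CSV file and return (times, {column name: values}).

    Assumes the time column holds plain numbers; any other column whose
    values do not all parse as floats is ignored.
    """
    with open(path, newline="") as handle:
        rows = list(csv.DictReader(handle))
    times = [float(row[time_column]) for row in rows]
    columns = {}
    for name in (rows[0].keys() if rows else []):
        if name == time_column:
            continue
        try:
            columns[name] = [float(row[name]) for row in rows]
        except (TypeError, ValueError):
            continue  # not a plottable column
    return times, columns


def plot_files(paths, time_column="timestamp"):
    """Plot every numeric column of every file against its time column."""
    # Imported here so the parsing code above works without matplotlib.
    import matplotlib.pyplot as plt

    for path in paths:
        times, columns = load_series(path, time_column)
        for name, values in columns.items():
            plt.plot(times, values, label=f"{path}: {name}")
    plt.xlabel(time_column)
    plt.legend()
    plt.show()


if __name__ == "__main__":
    plot_files(sys.argv[1:])
```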


Compare Lists in C# Unit Tests

A small thing that caught me out today: while writing a unit test in C# I wanted to check that a list was returned in the expected order and used Assert.AreEqual to compare the two lists. This kept failing despite the lists being exactly the same.

It turns out there is a separate CollectionAssert class for lists and any other C# collection. Because the List<T> class does not override the Equals method, Assert.AreEqual just compares the references instead.

Here is an example of a working unit test comparing two lists.


Time Bound Python Queue

An extension of the standard Python queue that only stores elements for a given number of seconds before they are removed. Useful if you don’t know the volume of data being added to the queue but need to limit it in some way.

Removing expired items made no measurable difference to run time in the test below.

Without item removal 

Queue size: 10000

real 0m10.771s
user 0m0.217s
sys 0m0.028s

With item removal 

Queue size: 4659

real 0m10.765s
user 0m0.203s
sys 0m0.028s
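One way such a queue could be built is sketched here; it is an assumption about the design, not the original class. It subclasses queue.Queue and overrides the same internal hooks that LifoQueue and PriorityQueue use, stamping each item on the way in and dropping stale items whenever the queue is touched. The clock parameter is only there to make the behaviour easy to demonstrate.

```python
import queue
import time
from collections import deque


class TimeBoundQueue(queue.Queue):
    """A FIFO queue whose items expire after max_age_seconds.

    Expired items are discarded lazily on put(), get() and qsize().
    Note that discarded items never receive task_done(), so join()
    is not meaningful for this class.
    """

    def __init__(self, max_age_seconds, maxsize=0, clock=time.monotonic):
        self.max_age_seconds = max_age_seconds
        self._clock = clock
        super().__init__(maxsize)

    def _init(self, maxsize):
        # Each entry is a (timestamp, item) pair, oldest on the left.
        self.queue = deque()

    def _expire(self):
        cutoff = self._clock() - self.max_age_seconds
        while self.queue and self.queue[0][0] < cutoff:
            self.queue.popleft()

    def _qsize(self):
        self._expire()
        return len(self.queue)

    def _put(self, item):
        self._expire()
        self.queue.append((self._clock(), item))

    def _get(self):
        # Queue.get() always checks _qsize() first, so stale items
        # have already been dropped by the time we get here.
        return self.queue.popleft()[1]
```

Since items arrive in time order, expiry only ever needs to look at the left end of the deque, which keeps the clean-up cheap.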


SQLite Database to CSV

I wrote a Python script that dumps a SQLite database to CSV. It writes the column names as the header row and then the values below. It also has a --sumarise option that collapses multiple rows in a database into a one-line CSV file.

XML output could be supported in the future, but the user would need to define a format.
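The core of a dump like this is short with the standard sqlite3 and csv modules. The sketch below covers only the header-plus-rows case for a single table; the function name and the single-table scope are my assumptions, and the summarising option is not attempted.

```python
import csv
import sqlite3


def dump_table(connection, table, out_file):
    """Write one table to out_file as CSV, column names first.

    Table names cannot be passed as SQL parameters, so the identifier
    is double-quoted with any embedded quotes escaped.
    """
    quoted = '"' + table.replace('"', '""') + '"'
    cursor = connection.execute(f"SELECT * FROM {quoted}")
    writer = csv.writer(out_file)
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


def list_tables(connection):
    """Return the names of all user tables in the database."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

For example, with connection = sqlite3.connect("data.db"), looping over list_tables(connection) and calling dump_table for each name with a file opened using newline="" would give one CSV per table.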


Unit Testing and Code Coverage Results in Jenkins CI

Unit testing is great, code coverage is great and continuous integration servers are great; but what isn’t great is combining all three into something useful!

If you are coding in Java then code coverage is fantastically simple but what if you use C++ and GoogleTest and want the same level of graphification? Luckily it is also simple if you know how – so you’re in the right place.

The scenario here is that you have a C++ project being built on a Jenkins/Hudson continuous integration setup and you use Google Test for your unit tests. The software required is GCC 4.7, which should include a tool called gcov. This creates the actual coverage files for your executable but requires some additional compiler and linker options to be added to your build scripts or makefiles. The final piece of software required is gcovr, a Python script designed to read the output files from gcov-instrumented executables and convert them into an XML format that can be read by the Cobertura plugin in Jenkins/Hudson.

It is a good idea to add a separate make target for coverage, as it requires the executables to be built with no optimisation and a lot of debug symbols, which is not very useful for production environments.

Add the following to your CPPFLAGS –

And this to your LDFLAGS

So for example you could have a section in your makefile that looks like this (assuming the test target builds your unit tests) –

After you have run the coverage build target you then have to run the test executable. This writes .gcda coverage data files, from which gcov produces the .gcov reports describing which lines of code were covered.

Then ensure that you can access the gcovr script and run it with the following arguments –

The --gcov-exclude option skips unnecessary parsing of files from the GoogleTest framework. The output is placed into the coverage.xml file, which can then be accessed from Jenkins by adding a new Post-Build Action on the job configuration page.
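To tie the steps together, here is a small sketch that assembles the three commands a Jenkins build step would run. Everything specific in it is an assumption rather than the original configuration: the flag lists use GCC's --coverage shorthand, and the make target name, test binary path and exclude pattern are placeholders to adapt to your project.

```python
import shlex

# Assumed flags for a GCC coverage build: no optimisation, debug symbols,
# and --coverage, which instruments at compile time and links libgcov.
COVERAGE_CPPFLAGS = ["-O0", "-g", "--coverage"]
COVERAGE_LDFLAGS = ["--coverage"]


def gcovr_command(root=".", output="coverage.xml", gcov_exclude=r".*gtest.*"):
    """Build the gcovr call that writes Cobertura-style XML.

    The exclude pattern is a placeholder for skipping GoogleTest files.
    """
    return [
        "gcovr", "--root", root,
        "--xml", "--output", output,
        "--gcov-exclude", gcov_exclude,
    ]


def coverage_steps(make_target="coverage", test_binary="./unit_tests"):
    """Return the commands to run, in order: build, run tests, report."""
    return [
        [
            "make", make_target,
            "CPPFLAGS=" + " ".join(COVERAGE_CPPFLAGS),
            "LDFLAGS=" + " ".join(COVERAGE_LDFLAGS),
        ],
        [test_binary],
        gcovr_command(),
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so they can be
    # pasted into a Jenkins shell build step.
    for step in coverage_steps():
        print(shlex.join(step))
```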

© 2019 Acodemics
