Tuesday, 22 January 2013

Integrate Facebook logins in your Android app


One of the easiest ways to get users to log in to your application is by letting them use their Facebook account to do so. In my experience, most people will opt to re-use their Facebook login, which I can understand because I don’t always want to create a new account for every app I use.
I will walk you through the process of integrating Facebook logins in your Android application. The setup involved is fairly straightforward. I suggest that you first check out the sample source code for this article that I posted on GitHub.

Create a Facebook app

The first step is to create an application in Facebook. You can do this by going to the Developer App once you are logged in to Facebook.
From the developer portal, click the + Create New App button at the top right. Give the app a name, agree to their terms by clicking the checkbox, and then click the Continue button. Complete the Captcha test and then click the Continue button.
You will then be taken to a screen where you can edit all the details for your Facebook app. Click the Mobile link in the left-hand nav, as this is the only section we will be interested in. Take note of the App ID, because you will need this later when coding your Android app.
Next, you will need to generate a Key Hash for the application. For debugging, if using Eclipse, you will want to generate this Key Hash using the Android debug key. When you are ready to publish your app, you will need to generate a Key Hash for your signing keys and update this value in Facebook before your signed app will work.
To generate a Key Hash for your debug keys, first open a command prompt (Windows) or terminal (Mac). At the command line, navigate to the directory where your Android debug keystore is stored. On Windows XP it will be “C:\Documents and Settings\<User>\.android” (on Windows Vista and later, “C:\Users\<User>\.android”), and on a Mac it will be “~/.android”.
Once you are in the “.android” directory, run the following command. When it prompts you for a password, type android and hit Enter:
keytool -exportcert -alias androiddebugkey -keystore debug.keystore | openssl sha1 -binary | openssl base64
Copy the value printed in the command-line that ends with an “=” and paste it in the Key Hash field in Facebook. Then click the Save Changes button.
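For readers who want to see what that pipeline computes, here is a minimal Java sketch of the same calculation: a SHA-1 digest of the exported certificate bytes, encoded as Base64. The class and method names are hypothetical and are not part of the Facebook SDK or my sample project. It assumes you already have the certificate bytes in memory, and it uses java.util.Base64, which requires Java 8 or later.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for illustration; not part of the Facebook SDK or the sample.
class KeyHashSketch {

    // Returns the Base64-encoded SHA-1 digest of the given bytes, the same
    // transformation that "openssl sha1 -binary | openssl base64" applies.
    static String keyHash(byte[] certificateBytes) {
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(certificateBytes);
            return Base64.getEncoder().encodeToString(digest);
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }
}
```

A SHA-1 digest is always 20 bytes, so the encoded result is always 28 characters long and ends with a single “=”, which is why the value you copy from the command line ends that way.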

Copy the necessary Android assets

Now you are ready to start integrating Facebook login in your Android application. Before you do this, you need to make sure you have installed the official Facebook app, which you’ll find on Google Play (formerly the Android Market).
Copy the /src/com/facebook folder of my sample code into the /src/com folder of your Android project. Also copy the facebook_icon.png image file from the drawable folder of my sample into the drawable folder of your project. Then, edit the FbDialog.java file to change the package name of the following line:
import com.kmiller.facebookintegration.R;
Next, copy /src/com/kmiller/facebookintegration/SessionStore.java and /src/com/kmiller/facebookintegration/Util.java into your project and fix the package name in each file.

Authorize/query Facebook

The Login.java file of my sample application contains all the logic for authorizing the user and querying Facebook afterwards. Here’s a look at the various sections of that file and the responsibility of each part.
mAPP_ID
public static final String mAPP_ID = "<your_app_id>";
You should replace this value with the App ID of the Facebook app you created in the beginning.
onCreate
public void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.main);
    ((Button)findViewById(R.id.LoginButton)).setOnClickListener( loginButtonListener );
    SessionStore.restore(mFacebook, this);
}
The call to SessionStore.restore() will restore a previous Facebook session (if it was stored properly in the first place) so you won’t have to re-authorize the user.
onActivityResult
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    mFacebook.authorizeCallback(requestCode, resultCode, data);
}
This method gets called after the user is returned from Facebook authorization. The one line of code in this method simply passes the results of onActivityResult to the mFacebook object’s authorizeCallback method.
loginButtonListener
private OnClickListener loginButtonListener = new OnClickListener() {
    public void onClick( View v ) {
        if( !mFacebook.isSessionValid() ) {
            Toast.makeText(Login.this, "Authorizing", Toast.LENGTH_SHORT).show();
            mFacebook.authorize(Login.this, new String[] { "" }, new LoginDialogListener());
        }
        else {
            Toast.makeText( Login.this, "Has valid session", Toast.LENGTH_SHORT).show();
            try {
                JSONObject json = Util.parseJson(mFacebook.request("me"));
                String facebookID = json.getString("id");
                String firstName = json.getString("first_name");
                String lastName = json.getString("last_name");
                Toast.makeText(Login.this, "You already have a valid session, " + firstName + " " + lastName + ". No need to re-authorize.", Toast.LENGTH_SHORT).show();
            }
            catch( Exception error ) {
                Toast.makeText( Login.this, error.toString(), Toast.LENGTH_SHORT).show();
            }
            catch( FacebookError error ) {
                Toast.makeText( Login.this, error.toString(), Toast.LENGTH_SHORT).show();
            }
        }
    }
};
When the user clicks the “Login with Facebook” button in the sample application, this is the OnClickListener that fires. It starts by checking if there is a valid Facebook session. If there is not a valid Facebook session, mFacebook.authorize is called. Otherwise, it queries Facebook using mFacebook.request("me") to get the user’s first and last name.
LoginDialogListener
public final class LoginDialogListener implements DialogListener {
    public void onComplete(Bundle values) {
        try {
            //The user has logged in, so now you can query and use their Facebook info
            JSONObject json = Util.parseJson(mFacebook.request("me"));
            String facebookID = json.getString("id");
            String firstName = json.getString("first_name");
            String lastName = json.getString("last_name");
            Toast.makeText( Login.this, "Thank you for Logging In, " + firstName + " " + lastName + "!", Toast.LENGTH_SHORT).show();
            SessionStore.save(mFacebook, Login.this);
        }
        catch( Exception error ) {
            Toast.makeText( Login.this, error.toString(), Toast.LENGTH_SHORT).show();
        }
        catch( FacebookError error ) {
            Toast.makeText( Login.this, error.toString(), Toast.LENGTH_SHORT).show();
        }
    }
    public void onFacebookError(FacebookError error) {
        Toast.makeText( Login.this, "Something went wrong. Please try again.", Toast.LENGTH_LONG).show();
    }
    public void onError(DialogError error) {
        Toast.makeText( Login.this, "Something went wrong. Please try again.", Toast.LENGTH_LONG).show();
    }
    public void onCancel() {
        Toast.makeText( Login.this, "Something went wrong. Please try again.", Toast.LENGTH_LONG).show();
    }
}
If the login is successful, the onComplete method of the DialogListener will fire. At this point, you may query Facebook for the user’s information since you now have a valid session. You should also call SessionStore.save() to save this current session to keep from re-requesting authorization in the future.
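To make the save, restore, and validity-check cycle a little more concrete, here is a rough plain-Java sketch of the idea. Everything in it is hypothetical: the class, field, and key names are mine, a Map stands in for Android’s SharedPreferences, and treating an expiry of 0 as “never expires” is an assumption. It is not the code in SessionStore.java or the SDK, just a model of the behavior described above.

```java
import java.util.Map;

// Hypothetical stand-in for the session handling described above; the names
// here are illustrative and are not the SDK's or the sample project's.
class SessionSketch {
    String accessToken;    // null until the user has authorized the app
    long expiresAtMillis;  // assumption: 0 means the token never expires

    // A session is usable when a token exists and it has not expired yet.
    boolean isValid(long nowMillis) {
        return accessToken != null
                && (expiresAtMillis == 0 || nowMillis < expiresAtMillis);
    }

    // Persist the session; the Map stands in for Android's SharedPreferences.
    void save(Map<String, String> store) {
        store.put("access_token", accessToken);
        store.put("expires_at", Long.toString(expiresAtMillis));
    }

    // Rebuild a session from the store; missing keys yield an invalid session.
    static SessionSketch restore(Map<String, String> store) {
        SessionSketch session = new SessionSketch();
        session.accessToken = store.get("access_token");
        String expires = store.get("expires_at");
        session.expiresAtMillis = (expires == null) ? 0L : Long.parseLong(expires);
        return session;
    }
}
```

With a model like this, restoring from an empty store gives a session that fails the validity check, which corresponds to the branch where the app has to ask the user to authorize again.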
If the login fails, the onFacebookError, the onError, or the onCancel method will be fired, and you can handle it appropriately. Your app should now be capable of the basics regarding Facebook authorization.
If you’re interested in me expanding the sample app to show some of the deeper Facebook integration that is possible, please let me know in the discussion.

Monday, 7 January 2013

OCR training


Adding New Fonts to Tesseract 3 OCR Engine

Tesseract is a great and powerful OCR engine, but their instructions for adding a new font are incredibly long and complicated. At CourtListener we have to handle several unusual blackletter fonts, so we had to go through this process a few times. Below I've explained the process so others may more easily add fonts to their system.
The process has a few major steps:

Create training documents

To create training documents, open up MS Word or LibreOffice and paste in the contents of the attached file named 'standard-training-text.txt'. This file contains the training text that is used by Tesseract for the included fonts.
Set your line spacing to at least 1.5, and space out the letters by about 1pt. using character spacing. I've attached a sample doc too, if that helps. Set the text to the font you want to use, and save it as font-name.doc.
Save the document as a PDF (call it lang.font-name.exp0.pdf, with lang being an ISO 639 three-letter abbreviation for your language), and then use the following command to convert it to a 300 dpi TIFF (requires ImageMagick):
convert -density 300 -depth 4 lang.font-name.exp0.pdf lang.font-name.exp0.tif
You'll now have a good training image called lang.font-name.exp0.tif. If you're adding multiple fonts, or bold, italic or underline, repeat this process multiple times, creating one doc → pdf → tiff per font variation.
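If you have many font variations, it can help to generate the convert commands rather than type them. The following is a small sketch of that idea in Java. The class and method names are hypothetical, and it only builds the argument lists using the lang.font-name.exp0 naming convention from above; it does not run ImageMagick itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; it builds commands but does not run them.
class TrainingImageCommands {

    // Base name following the lang.font-name.exp0 convention used above.
    static String baseName(String lang, String fontName) {
        return lang + "." + fontName + ".exp0";
    }

    // Argument list for converting one PDF into a 300 dpi TIFF.
    static List<String> convertCommand(String lang, String fontName) {
        String base = baseName(lang, fontName);
        return Arrays.asList("convert", "-density", "300", "-depth", "4",
                base + ".pdf", base + ".tif");
    }

    // One command per font variation (regular, bold, italic, and so on).
    static List<List<String>> commandsFor(String lang, List<String> fontNames) {
        List<List<String>> commands = new ArrayList<>();
        for (String fontName : fontNames) {
            commands.add(convertCommand(lang, fontName));
        }
        return commands;
    }
}
```

Each list can be handed to a ProcessBuilder if you want Java to launch the conversion, or simply printed and pasted into a terminal.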

Train Tesseract

The next step is to run tesseract over the image(s) we just created, and to see how well it can do with the new font. After it's taken its best shot, we then give it corrections. It'll provide us with a box file, which is just a file containing x,y coordinates of each letter it found along with what letter it thinks it is. So let's see what it can do:
tesseract lang.font-name.exp0.tif lang.font-name.exp0 batch.nochop makebox
You'll now have a file called lang.font-name.exp0.box, and you'll need to open it in a box-file editor. There are a bunch of these on the Tesseract wiki. The one that works for me (on Ubuntu) is moshpytt, though it doesn't support multi-page tiffs. If you need to use a multi-page tiff, see the issue on the topic for tips. Once you've opened it, go through every letter, and make sure it was detected correctly. If a letter was skipped, add it as a row to the box file. Similarly, if two letters were detected as one, break them up into two lines.
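Before or after editing by hand, a quick sanity check of the box file can catch obvious mistakes. The sketch below assumes the Tesseract 3 box line layout of "symbol left bottom right top page", with coordinates measured from the bottom-left corner of the image. The class and method names are hypothetical, and the checks are deliberately simple: they flag lines that don't parse and boxes with no area, not letters that were recognized incorrectly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical checker for illustration; not part of Tesseract.
class BoxFileSketch {

    static final class Box {
        final String symbol;
        final int left, bottom, right, top, page;

        Box(String symbol, int left, int bottom, int right, int top, int page) {
            this.symbol = symbol;
            this.left = left;
            this.bottom = bottom;
            this.right = right;
            this.top = top;
            this.page = page;
        }
    }

    // Parses "symbol left bottom right top [page]"; page defaults to 0 if absent.
    static Box parseLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 5 && parts.length != 6) {
            throw new IllegalArgumentException("Expected 5 or 6 fields: " + line);
        }
        int page = (parts.length == 6) ? Integer.parseInt(parts[5]) : 0;
        return new Box(parts[0],
                Integer.parseInt(parts[1]), Integer.parseInt(parts[2]),
                Integer.parseInt(parts[3]), Integer.parseInt(parts[4]), page);
    }

    // Returns one message per line that fails to parse or describes an empty box.
    static List<String> findProblems(List<String> lines) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            String line = lines.get(i);
            if (line.trim().isEmpty()) {
                continue; // ignore blank lines
            }
            try {
                Box box = parseLine(line);
                if (box.right <= box.left || box.top <= box.bottom) {
                    problems.add("line " + (i + 1) + ": box has no area: " + line);
                }
            } catch (IllegalArgumentException e) {
                // NumberFormatException is a subclass, so bad numbers land here too.
                problems.add("line " + (i + 1) + ": cannot parse: " + line);
            }
        }
        return problems;
    }
}
```

For example, a line such as "s 734 494 751 519 0" parses cleanly, while "p 10 20 5 40 0" is reported because its right edge lies to the left of its left edge.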
When that's done, you feed the box file back into tesseract:
tesseract lang.font-name.exp0.tif lang.font-name.exp0 nobatch box.train.stderr
Next, you need to detect the character set used in all your box files:
unicharset_extractor *.box
When that's complete, you need to create a font_properties file. It should list every font you're training, one per line, and identify whether it has the following characteristics: <fontname> <italic> <bold> <fixed> <serif> <fraktur>
So, for example, if you use the standard training data, you might end up with a file like this:
eng.arial.box 0 0 0 0 0
eng.arialbd.box 0 1 0 0 0
eng.arialbi.box 1 1 0 0 0
eng.ariali.box 1 0 0 0 0
eng.b018012l.box 0 0 0 1 0
eng.b018015l.box 0 1 0 1 0
eng.b018032l.box 1 0 0 1 0
eng.b018035l.box 1 1 0 1 0
eng.c059013l.box 0 0 0 1 0
eng.c059016l.box 0 1 0 1 0
eng.c059033l.box 1 0 0 1 0
eng.c059036l.box 1 1 0 1 0
eng.cour.box 0 0 1 1 0
eng.courbd.box 0 1 1 1 0
eng.courbi.box 1 1 1 1 0
eng.couri.box 1 0 1 1 0
eng.georgia.box 0 0 0 1 0
eng.georgiab.box 0 1 0 1 0
eng.georgiai.box 1 0 0 1 0
eng.georgiaz.box 1 1 0 1 0
eng.lincoln.box 0 0 0 0 1
eng.old-english.box 0 0 0 0 1
eng.times.box 0 0 0 1 0
eng.timesbd.box 0 1 0 1 0
eng.timesbi.box 1 1 0 1 0
eng.timesi.box 1 0 0 1 0
eng.trebuc.box 0 0 0 1 0
eng.trebucbd.box 0 1 0 1 0
eng.trebucbi.box 1 1 0 1 0
eng.trebucit.box 1 0 0 1 0
eng.verdana.box 0 0 0 0 0
eng.verdanab.box 0 1 0 0 0
eng.verdanai.box 1 0 0 0 0
eng.verdanaz.box 1 1 0 0 0
Note that this is the standard font_properties file that should be supplied with Tesseract, to which I've added the two rows for the blackletter fonts I'm training (eng.lincoln.box and eng.old-english.box, the only rows with the fraktur flag set). You can also see which fonts are included out of the box.
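Since a malformed font_properties line is easy to introduce by hand, here is a small Java sketch that checks a line against the layout described above: a name followed by five flags, each 0 or 1, in the order italic, bold, fixed, serif, fraktur. The class and method names are hypothetical, and this is only a format check; it is not something Tesseract provides.

```java
// Hypothetical validator for illustration; not part of Tesseract.
class FontPropertiesSketch {

    static final class FontEntry {
        final String name;
        final boolean italic, bold, fixed, serif, fraktur;

        FontEntry(String name, boolean[] flags) {
            this.name = name;
            this.italic = flags[0];
            this.bold = flags[1];
            this.fixed = flags[2];
            this.serif = flags[3];
            this.fraktur = flags[4];
        }
    }

    // Parses "<fontname> <italic> <bold> <fixed> <serif> <fraktur>".
    static FontEntry parseLine(String line) {
        String[] parts = line.trim().split("\\s+");
        if (parts.length != 6) {
            throw new IllegalArgumentException("Expected a name and five flags: " + line);
        }
        boolean[] flags = new boolean[5];
        for (int i = 0; i < flags.length; i++) {
            String flag = parts[i + 1];
            if (!flag.equals("0") && !flag.equals("1")) {
                throw new IllegalArgumentException("Flags must be 0 or 1: " + line);
            }
            flags[i] = flag.equals("1");
        }
        return new FontEntry(parts[0], flags);
    }
}
```

Parsing "eng.lincoln.box 0 0 0 0 1" yields an entry whose only set flag is fraktur, while a line with a missing flag or a value other than 0 or 1 is rejected with an IllegalArgumentException.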
We're getting near the end. Next, create the clustering data:
mftraining -F font_properties -U unicharset -O lang.unicharset *.tr
cntraining *.tr
If you want, you can create a wordlist or a unicharambigs file. If you don't plan on doing that, the last step is to combine the various files we've created.
To do that, rename each of the language files (normproto, Microfeat, inttemp, pffmtable) to have your lang prefix, and run (mind the dot at the end):
combine_tessdata lang.
This will create all the data files you need, and you just need to move them to the correct place on your OS. On Ubuntu, I was able to move them with:
sudo mv eng.traineddata /usr/local/share/tessdata/
And that, good friend, is it. Worst process for a human, ever.