With the bulk data uploader tool included in the SDK, Google App Engine makes it possible to upload data from a CSV file and convert each row into an entity in your application's datastore. You can even specify that the entities be searchable.
In this article, we'll create a program that searches our datastore for entities based on a keyword. We will then show how to use the bulk data uploader to load searchable data onto the application.
First, let's write a program that searches the datastore based on a keyword argument and returns all the entities that match the given keyword:
import wsgiref.handlers

from google.appengine.ext import webapp
from google.appengine.ext import search


class MainPage(webapp.RequestHandler):
  def get(self):
    # We use the webapp framework to retrieve the keyword
    keyword = self.request.get('keyword')
    self.response.headers['Content-Type'] = 'text/plain'
    if not keyword:
      self.response.out.write("No keyword has been set")
    else:
      # Search the 'Person' entity based on our keyword
      query = search.SearchableQuery('Person')
      query.Search(keyword)
      for result in query.Run():
        self.response.out.write('%s' % result['email'])


def main():
  application = webapp.WSGIApplication([('/', MainPage)], debug=True)
  wsgiref.handlers.CGIHandler().run(application)


if __name__ == "__main__":
  main()
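If you prefer to work at the model level instead of with raw datastore entities, the same module also provides a SearchableModel class. The snippet below is a minimal, untested sketch of that approach; the Person model definition is an assumption made for illustration, since this article itself sticks to the lower-level SearchableQuery.

from google.appengine.ext import db
from google.appengine.ext import search


# Hypothetical model-level equivalent of the query above. SearchableModel
# indexes the model's text properties so they can be matched with search().
class Person(search.SearchableModel):
  name = db.StringProperty()
  email = db.EmailProperty()


def emails_matching(keyword):
  # all() on a SearchableModel returns a query that also supports search()
  return [person.email for person in Person.all().search(keyword)]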
Great—however, our application doesn't yet contain any data! So let me now show you how to organize and upload some data into this application.
The bulk upload module is available in google.appengine.ext.bulkload and will allow you to import your external data into your application. It reads a CSV file, sends the data to your application, converts each line of data into an entity, and stores the entity in the datastore. You may also specify that the loader make entities searchable when it's uploading the data.

To use this module with our application, we create a loader that describes the data model. Then, we add the loader's handler to our app.yaml file. Finally, we use tools/bulkload_client.py to load the data into the datastore.
Let's create our bulk load handler in a file named myloader.py. The code below tells the bulk data loader what format the CSV data is in, and that the resulting entities should be indexed for search. The handler defines a new class that inherits from the Loader class and overrides the HandleEntity method; the overridden HandleEntity wraps each entity as a SearchableEntity so that it is indexed when it is stored.
from google.appengine.ext import bulkload
from google.appengine.api import datastore_types
from google.appengine.ext import search


class PersonLoader(bulkload.Loader):
  def __init__(self):
    # Our 'Person' entity contains a name string and an email
    bulkload.Loader.__init__(self, 'Person',
                             [('name', str),
                              ('email', datastore_types.Email),
                             ])

  def HandleEntity(self, entity):
    ent = search.SearchableEntity(entity)
    return ent


if __name__ == '__main__':
  bulkload.main(PersonLoader())
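HandleEntity is also a convenient hook for cleaning up each row before it is stored. The variant below is a sketch rather than part of the original loader: it relies on the entity passed in behaving like a dictionary keyed by property name, and the whitespace stripping is purely an illustrative choice.

from google.appengine.ext import bulkload
from google.appengine.api import datastore_types
from google.appengine.ext import search


class CleaningPersonLoader(bulkload.Loader):
  """Hypothetical variant of PersonLoader that tidies each row."""

  def __init__(self):
    bulkload.Loader.__init__(self, 'Person',
                             [('name', str),
                              ('email', datastore_types.Email),
                             ])

  def HandleEntity(self, entity):
    # The entity behaves like a dict keyed by property name, so we can
    # strip stray whitespace that often sneaks into hand-edited CSV files.
    entity['name'] = entity['name'].strip()
    # Wrap the entity so its text properties are indexed for search.
    return search.SearchableEntity(entity)


if __name__ == '__main__':
  bulkload.main(CleaningPersonLoader())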
Our CSV data for this entity model will look like this:
John, john@example.com
Mary, mary@example.com
William, william@example.com
...
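Note that the file is plain CSV with no header row, since the loader turns every line into an entity. If you generate the file programmatically, a few lines of Python with the standard csv module are enough; the people list here is just the sample data above.

import csv

# Sample rows in the (name, email) order that PersonLoader expects.
people = [
    ('John', 'john@example.com'),
    ('Mary', 'mary@example.com'),
    ('William', 'william@example.com'),
]

# Write plain CSV with no header row: every line becomes one entity.
f = open('people.csv', 'wb')
csv.writer(f).writerows(people)
f.close()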
Since our POST request endpoint will be at the /load URL, we must add the handler information for myloader.py to our app.yaml file:
- url: /load
  script: myloader.py
  login: admin
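For context, a complete app.yaml might look like the sketch below. The application ID, the main.py script name for the search handler, and the catch-all route are assumptions, since the article only shows the /load entry.

application: your-app-id    # hypothetical application ID
version: 1
runtime: python
api_version: 1

handlers:
# Bulk load endpoint; admin-only, so only you can push data in.
- url: /load
  script: myloader.py
  login: admin

# Search handler from the start of the article, assumed to live in main.py.
- url: /.*
  script: main.py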
Now that we've set up our application and data loader, we can run the app locally with dev_appserver.py. Once the app is running, we're ready to send the CSV data to our app to have it loaded into the datastore.
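Assuming the application lives in a directory named myapp/ (the directory name is hypothetical), the development server is started with the command below. By default it serves on http://localhost:8080, which matches the URLs used in the rest of this article.

./dev_appserver.py myapp/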
The bulkload client is located in the SDK (tools/bulkload_client.py). This is the Python script that will read in a CSV file, then batch and send the data to your application. We can use the following command to load a CSV file (people.csv) onto a development server:
./bulkload_client.py --filename people.csv \
                     --kind Person \
                     --url http://localhost:8080/load
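The same client can point at a deployed application by changing the --url; the application ID below is hypothetical. Keep in mind that the /load handler is marked login: admin, so a live instance will only accept the upload from an authenticated administrator, which the command as shown does not handle.

./bulkload_client.py --filename people.csv \
                     --kind Person \
                     --url http://your-app-id.appspot.com/load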
There you have it. Now, when we hit the URL http://localhost:8080/?keyword=William, we will see the following in our browser:
william@example.com