- DHCR & MapPLUTO Data Processing
- Creating Catchment Area Polygons For Local Tenants Rights Organizations
- Links to Code & Data
Am I Rent Stabilized? is a web application I conceptualized and designed that encourages New York City tenants to find out if their landlord may be illegally overcharging them for a rent stabilized apartment and, if so, motivates them to take action. The app was developed in response to the lack of enforcement of rent regulation laws in NYC by local and state government. It is an attempt at using open data as a prompt for civic action, rather than solely for visualization and analysis. The app first asks the user to input their address and borough, then checks it against a database of properties that are likely to have rent stabilized apartments. From here the app recommends a course of action and informs the user of their nearest tenants rights group so they can get help. The app features a responsive, mobile-friendly UI, and its content can be toggled to either Spanish or Chinese for non-English speakers.
The development of the app stems from a Freedom of Information Law request I made in the Fall of 2014 for the New York State Division of Housing and Community Renewal’s (DHCR) list of rent-stabilized buildings in a machine-readable format. Once I obtained the data, I was able to analyze NYC’s tax lot dataset, MapPLUTO, to determine which properties in NYC likely have rent-stabilized apartments, and whether or not they are registered with the DHCR. This is important because registration of rent-regulated apartments is essentially voluntary: as far as I’m aware, it’s not enforced by any city or state agency, so it’s easy for landlords in NYC to lie to tenants about whether their apartment is rent-stabilized.
Visualizing the data was great, but I was interested in pushing the usefulness of this dataset a little further. After I shared my discovery with Caroline Woolard, an NYC-based artist and activist, she suggested using the data in an app to let people know if they are rent stabilized. The rest is history.
Here are some screenshots of a few of the slides from the app’s landing page:
DHCR & MapPLUTO Data Processing
Am I Rent Stabilized? uses the dataset I created of properties that likely have rent stabilized apartments in NYC. I chose to host this database on CartoDB so that the app could take advantage of CartoDB’s SQL API and the CartoDB.JS library. However, I did a lot of data processing on my local machine before importing the data into CartoDB, mainly because the MapPLUTO dataset is too large to import into CartoDB without a paid plan that provides more storage space.
The next part outlines how I got there.
1. Processing the DHCR Rent Stabilized Building List
The Excel workbooks I obtained from the FOIL request were normalized, stacked, and converted to a comma-separated values (CSV) file using a Node.js script. This allowed the data to be geocoded in one pass and then imported into a PostgreSQL database where it could be analyzed with the NYC MapPLUTO GIS tax lot data.
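To illustrate what "normalized and stacked" means here, the following is a minimal Python sketch (the original used a Node.js script, which is not reproduced). It assumes each workbook has already been read into a header row plus data rows; the `source` label, the column names, and the function names are hypothetical.

```python
import csv
import io


def normalize_header(name):
    """Lowercase a column name and collapse non-alphanumeric runs to underscores."""
    cleaned = "".join(ch.lower() if ch.isalnum() else "_" for ch in name.strip())
    return "_".join(part for part in cleaned.split("_") if part)


def stack_tables(tables):
    """Stack several tables into one list of records.

    `tables` is a list of (source, rows) pairs, where rows[0] is the header.
    The `source` label (for example a workbook name) is a hypothetical extra column.
    """
    stacked = []
    for source, rows in tables:
        header = [normalize_header(h) for h in rows[0]]
        for row in rows[1:]:
            record = dict(zip(header, (cell.strip() for cell in row)))
            record["source"] = source
            stacked.append(record)
    return stacked


def to_csv(records):
    """Write records to CSV text, using the union of all keys as columns."""
    fieldnames = []
    for record in records:
        for key in record:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()
```

Taking the union of column names means a workbook with an extra column does not break the stack; rows that lack the column simply get an empty cell.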
2. Geocoding the Processed DHCR data
A Python script was then used to obtain values for each property’s Borough - Block - Lot number (BBL), Building Identification Number (BIN), and latitude - longitude coordinates from the NYC GeoClient API. A property’s street address and borough are passed to the GeoClient API, which then returns lots of useful information about the property such as the BBL, BIN, latitude, and longitude values.
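The geocoding step can be sketched roughly as below. This is not the original script: the endpoint URL, the query parameter names, and the response field names are assumptions based on my understanding of the GeoClient v1 address search and should be checked against the current API documentation. The sketch only builds the request URL and picks fields out of an already-decoded JSON response, so no network call is made.

```python
from urllib.parse import urlencode

# Assumed endpoint for the GeoClient v1 address search; verify before use.
GEOCLIENT_ADDRESS_URL = "https://api.cityofnewyork.us/geoclient/v1/address.json"


def build_request_url(house_number, street, borough, app_id, app_key):
    """Build an address-search URL (parameter names are assumptions)."""
    params = {
        "houseNumber": house_number,
        "street": street,
        "borough": borough,
        "app_id": app_id,
        "app_key": app_key,
    }
    return GEOCLIENT_ADDRESS_URL + "?" + urlencode(params)


def extract_fields(response):
    """Pull BBL, BIN, and coordinates from a decoded JSON response.

    The key names are assumptions; missing keys come back as None so that
    addresses that failed to geocode can be reviewed later.
    """
    address = response.get("address", {})
    return {
        "bbl": address.get("bbl"),
        "bin": address.get("buildingIdentificationNumber"),
        "lat": address.get("latitude"),
        "lng": address.get("longitude"),
    }
```

In practice each row of the processed CSV would be sent through `build_request_url`, fetched, and the result of `extract_fields` appended to that row.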
3. Determining NYC Properties That Are Likely Rent Stabilized
After processing and geocoding the DHCR data it was imported into a PostgreSQL database using CSVkit’s csvsql command as follows:
From here PostgreSQL was then used to analyze the data. Here is a link to the entire SQL code, but the most important queries are the following:
These two queries tell us:
A. what properties in the MapPLUTO tax lot data match the DHCR’s rent-stabilized building list, and
B. what other properties are likely to have rent-stabilized apartments but aren’t on the DHCR list.
From here I created a table that combines data from both queries as well as a flag that states whether or not the property is listed in the DHCR data.
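The logic of combining the two result sets with a registration flag can be sketched in Python as follows. This is an illustration only, not the SQL used for the project. The criteria for "likely rent stabilized" are an assumption (six or more residential units, built before 1974, a common rule of thumb), and the lowercase field names `bbl`, `unitsres`, and `yearbuilt` are modeled on MapPLUTO columns.

```python
def is_likely_stabilized(lot, min_units=6, built_before=1974):
    """Assumed rule of thumb: enough residential units and an old enough building.

    A yearbuilt of 0 is treated as unknown and excluded.
    """
    return lot["unitsres"] >= min_units and 0 < lot["yearbuilt"] < built_before


def combine(pluto_lots, dhcr_bbls):
    """Return lots that are on the DHCR list or likely stabilized, with a flag.

    `dhcr_bbls` is the set of BBLs from the geocoded DHCR list; the
    `registered` flag records whether a lot appears in that list.
    """
    combined = []
    for lot in pluto_lots:
        registered = lot["bbl"] in dhcr_bbls
        if registered or is_likely_stabilized(lot):
            combined.append({**lot, "registered": registered})
    return combined
```

Keeping the flag in the combined table means one lookup answers both questions for a given address: is the property likely to have stabilized units, and is it on the DHCR list.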
4. Further Data Processing Using CartoDB
Lastly, the data was imported into CartoDB and some final tweaks to the data were made. Mainly this involved removing properties that belong to the New York City Housing Authority. To find out how many different spellings of this agency name were in the table, I did a spatial intersect with the NYCHA property data.
Creating Catchment Area Polygons For Local Tenants Rights Organizations
In order to inform a user as to whether or not any local tenants rights organizations are operating within their neighborhood, custom polygon geospatial data was created to represent each of 94 organizations’ service areas. This was a necessary step as many housing rights organizations work within specific neighborhoods, zip codes, community boards, or other boundaries and will not assist people outside of those boundaries. This is an understandable decision for these groups to make given the density of NYC and the fact that something like 80% of its residents are renters. Housing orgs are more often than not understaffed and overworked, so they must limit who they can help to ensure they do their work effectively.
Scraping DHCR’s Community Based Housing Organizations
First, a list of Community Based Housing Organizations was scraped from an HTML table on the DHCR’s website using a Python script. Organizations that operate in the boroughs / counties that make up NYC were pulled out from the scraped data into a new table.
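A scrape-and-filter step of this kind might look roughly like the sketch below, which uses only Python’s standard library. It is not the original script: the column names (`Organization`, `County`) are hypothetical, and the page is assumed to have already been downloaded into a string. The five county names are the counties that correspond to NYC’s boroughs.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Collapse internal whitespace so cells compare cleanly.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


NYC_COUNTIES = {"Bronx", "Kings", "New York", "Queens", "Richmond"}


def nyc_organizations(html, county_column="County"):
    """Parse the first row as a header and keep records in NYC counties."""
    parser = TableParser()
    parser.feed(html)
    if not parser.rows:
        return []
    header, *body = parser.rows
    records = [dict(zip(header, row)) for row in body]
    return [r for r in records if r.get(county_column) in NYC_COUNTIES]
```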
Creating the Catchment Areas
For these 94 organizations, polygon data was manually created representing each organization’s service area. Reference polygon geospatial data sources used to create the service areas include Pediacities NYC Neighborhood boundaries, NYC Planning Neighborhood Tabulation Areas, U.S. Census ZIP Code Tabulation Areas, and NYC Planning Community District boundaries. This data was copied and in some cases aggregated (dissolved) into a new dataset using MAPublisher, a GIS plug-in for Adobe Illustrator. In some cases boundaries had to be drawn by hand, such as for the Cooper Square Committee, which operates within a very specific area in the East Village of Manhattan. Once completed, the polygon data was joined to the DHCR Community Based Housing Organizations data for NYC and then exported to a shapefile format.
The data was then imported into CartoDB for use with Am I Rent Stabilized?. When a user’s address is geocoded, a “point in polygon” SQL query is made using PostGIS to the data in CartoDB.
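As a rough sketch of what such a "point in polygon" request could look like, the Python below builds a PostGIS query and a SQL API URL for it. The table name, account name, and function names are hypothetical, and the URL pattern is an assumption about CartoDB’s SQL API at the time; `the_geom` is CartoDB’s default geometry column, and `ST_Contains`, `ST_SetSRID`, and `ST_MakePoint` are standard PostGIS functions. The app itself makes this request from the browser, so this is an illustration of the query rather than the app’s code.

```python
from urllib.parse import urlencode


def point_in_polygon_sql(table, lng, lat):
    """Build a query selecting polygons that contain the given point.

    Coordinates are coerced to float and the table name is checked so that
    user input cannot alter the structure of the query.
    """
    if not table.replace("_", "").isalnum():
        raise ValueError("unexpected table name")
    point = f"ST_SetSRID(ST_MakePoint({float(lng)!r}, {float(lat)!r}), 4326)"
    return f"SELECT * FROM {table} WHERE ST_Contains(the_geom, {point})"


def sql_api_url(account, sql):
    """Build a SQL API request URL (URL pattern is an assumption)."""
    return f"https://{account}.cartodb.com/api/v2/sql?" + urlencode({"q": sql})
```

Note that `ST_MakePoint` takes longitude first, then latitude; swapping them is a common source of empty results.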
If a user’s address is within a group’s catchment area, then that group’s information is passed into a modal in the app. This modal displays information such as the group’s website URL, phone number, contact person, and/or address. As what’s present in this data varies from group to group, a Handlebars.js helper function is used to check if the data exists before passing it to the Handlebars HTML template:
The Handlebars HTML template looks like this:
That’s about it. Thanks for reading, and please feel free to ping me if you have any questions or comments.
Links to Code & Data
- Github Repo for Am I Rent Stabilized
- Visualization of NYC properties that likely have rent-stabilized apartments
- DHCR Rent Stabilized Building List
(note: this is just a list of addresses that have one or more rent stabilized apartments, not the actual apartment numbers)
- NYC Likely Rent-Stabilized GIS data
- NYC Local Tenants Rights Groups Service Areas