Database · Neon PostgreSQL
market_listing
Top-recommended listings crawled from a major e-commerce marketplace using targeted search tags. Each row represents one product surfaced by the platform's recommendation algorithm for a given keyword. Data is collected in batches — earlier batches reflect stronger algorithmic promotion signals.
Field Reference
| Field | Type | Description & Values |
|---|---|---|
| id | integer PK | Auto-incremented primary key. Unique identifier for each listing row. |
| batch_id | text | Order of the crawl batch. "1" = first batch crawled (products that appear earliest in platform search results, strongest recommendation signal); "8" = last batch (appears later in results, smaller audience reach). Lower batch_id = higher platform visibility. Values: "1" – "8" |
| source_screenshot | text | Filename of the screenshot used as the source for this row's data extraction. e.g. "keepsake_1.png" |
| search_tag | text | The keyword searched on the platform that produced this listing in results. Represents a target market segment. Values: "baby girl gift" · "baby keepsake" · "birth announcement" · "keepsake" |
| etsy_best | text | Platform quality tier. Currently all crawled listings are star_seller, meaning the shop meets the platform's highest seller standards (high ratings, on-time shipping, responsive). Values: "star_seller" (all current data) |
| product_type | text | Normalized product category extracted from the listing title. Used to group and compare similar products across different shops and search tags. e.g. "ring dish" · "jewelry dish" · "photo album" · "baby romper" |
| title | text | Full listing title as shown on the platform. Often includes personalization keywords, material, and occasion descriptors. |
| price | integer | Current (discounted) listing price, stored as an integer in platform units. Divide by 10,000 to get USD: price / 10000 = USD. e.g. 111194 → $11.12 USD |
| original_price | integer | Pre-discount listing price in the same platform units (÷10,000 for USD). Relationship: price ≈ original_price × (1 − discount/100). |
| discount | integer | Percentage discount applied to the listing (valid values 0–100). Every row in the current data carries a discount, which suggests the platform strongly favors discounted listings in recommendations. Range in data: 5 – 70. Most common: 41–60% |
| shop_name | text | Name of the seller shop on the platform. e.g. "TreasureBoxStudioLTD" · "LanahomeCraft" |
| rating | real | Average star rating of the listing (1.0 – 5.0). All crawled listings score ≥ 4.0, reflecting the platform's quality filter for top recommendations. Range in data: 4.0 – 5.0. Avg: 4.88 |
| review_count | integer | Total number of customer reviews on this listing. Strong proxy for market demand and listing maturity. Used to distinguish established (1k+) from emerging (<500) products. Range: 1 – 69,400. Median: ~530 |
| badge | text | Platform-assigned visibility badge on the listing. Indicates algorithmic promotion status and demand signals. Null/empty = no special badge. Values: "Popular now" · "Bestseller" · "Etsy's Pick" · null |
| free_shipping | boolean | Whether the listing offers free shipping to buyers. Only 16% of top listings offer free shipping, suggesting it is not a primary ranking factor for this category. true = 29 listings · false = 150 listings |
| is_ad | boolean | Whether the listing is a paid promoted advertisement. Only 2 of 180 listings are ads, indicating that top recommendations are overwhelmingly organic. true = 2 · false = 178 |
| import_date | date | Date the row was imported into the database. Used for versioning and tracking when market snapshots were taken. Current data: 2026-04-15 |
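The price, original_price, and discount fields are tied together by the two formulas in the table. A minimal Python sketch of the conversion and the consistency check follows. The function names and the 1% tolerance are assumptions made for illustration; the table only states that the relationship is approximate.

```python
from decimal import Decimal, ROUND_HALF_UP

UNITS_PER_USD = 10_000  # from the field reference: price / 10000 = USD


def to_usd(units: int) -> Decimal:
    """Convert an integer platform-unit price to USD, rounded to cents."""
    usd = Decimal(units) / Decimal(UNITS_PER_USD)
    return usd.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def implied_price(original_price: int, discount: int) -> float:
    """Price predicted by original_price * (1 - discount / 100), in platform units."""
    return original_price * (1 - discount / 100)


def discount_is_consistent(price: int, original_price: int, discount: int,
                           tolerance: float = 0.01) -> bool:
    """True when price is close to the implied price.

    The tolerance (1% of original_price) is an assumed threshold, not a
    documented property of the data.
    """
    expected = implied_price(original_price, discount)
    return abs(price - expected) <= tolerance * original_price


print(to_usd(111194))  # the example value from the price row
```

Using Decimal for the conversion avoids binary floating-point rounding surprises when prices are displayed or summed in USD.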
How to Read the Data
batch_id as a ranking proxy
Products in batch_id = "1" have the strongest organic visibility — they appear first in search results. As batch_id increases, the platform shows these products to fewer users. High price + high rating in later batches (5–8) suggests the algorithm surfaces premium niche products at the end of its recommendation stack.
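Because batch_id is stored as text, it has to be cast to an integer before it is sorted or compared numerically. The sketch below groups rows by batch and reports average USD price and average rating, which is one way to check the premium-in-later-batches pattern. It assumes rows are available as dictionaries keyed by column name; summarize_by_batch is a hypothetical helper, not part of the dataset tooling.

```python
from collections import defaultdict
from statistics import mean


def summarize_by_batch(rows):
    """Return {batch number: (avg price in USD, avg rating)}, ordered by batch."""
    grouped = defaultdict(list)
    for row in rows:
        # batch_id is text ("1" to "8"), so cast before sorting or comparing.
        grouped[int(row["batch_id"])].append(row)
    return {
        batch: (
            round(mean(r["price"] for r in members) / 10_000, 2),
            round(mean(r["rating"] for r in members), 2),
        )
        for batch, members in sorted(grouped.items())
    }


# Made-up rows for illustration only, not values from the dataset.
sample = [
    {"batch_id": "8", "price": 300000, "rating": 5.0},
    {"batch_id": "1", "price": 100000, "rating": 4.8},
    {"batch_id": "1", "price": 200000, "rating": 4.6},
]
summary = summarize_by_batch(sample)
```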
Price is always discounted
Every listing in this dataset carries a discount (5–70%, per the discount field's observed range). The platform's algorithm appears to systematically favor discounted listings. Setting an original_price and applying a 40–60% discount is the dominant competitive strategy across all search tags.
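To check how dominant a discount band actually is for a given search tag, the share of listings inside the band can be computed directly. This is a small sketch; the function name and the inclusive 40–60 default bounds are assumptions taken from the text above.

```python
def share_in_discount_band(discounts, low=40, high=60):
    """Fraction of listings whose discount falls in [low, high], inclusive."""
    if not discounts:
        return 0.0
    in_band = sum(1 for d in discounts if low <= d <= high)
    return in_band / len(discounts)


# Made-up discount values for illustration only.
share = share_in_discount_band([5, 40, 50, 60, 70])
```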
Emerging vs. Established products
A product with rating ≥ 4.8 but review_count < 500 is an emerging opportunity — quality is proven, but the market isn't yet saturated. Products with 1,000+ reviews represent validated demand but have high competition barriers.
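The two thresholds above translate into a simple classifier. The sketch below is an assumption-laden illustration: the source does not say how to label listings that fall between the two definitions (for example 500–999 reviews, or fewer than 500 reviews with a rating below 4.8), so those are returned as "unclassified" here.

```python
def classify_maturity(rating: float, review_count: int) -> str:
    """Label a listing using the thresholds described in this section."""
    if review_count >= 1000:
        return "established"  # validated demand, high competition barriers
    if rating >= 4.8 and review_count < 500:
        return "emerging"  # proven quality, market not yet saturated
    # Not covered by either definition in the source; label is an assumption.
    return "unclassified"
```

For example, a listing with rating 4.9 and 120 reviews is labeled "emerging", while the same rating with 5,000 reviews is labeled "established".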
Badge = platform momentum signal
"Popular now" indicates short-term demand spikes — the platform actively promotes these listings. "Bestseller" reflects sustained historical sales volume. "Etsy's Pick" is editorial selection (rare, 3 total). A product type with many "Popular now" badges is currently trending and worth entering quickly.
Last updated: 2026-04-15 · Source: market_listing (Neon PostgreSQL)