Recipe recommendation using ingredient networks
Chun-Yuen Teng
School of Information, University of Michigan, Ann Arbor, MI, USA
[email protected]
Yu-Ru Lin
IQSS, Harvard University; CCS, Northeastern University, Boston, MA
[email protected]
Lada A. Adamic
School of Information, University of Michigan, Ann Arbor, MI, USA
[email protected]
ABSTRACT
The recording and sharing of cooking recipes, a human activity dating back thousands of years, naturally became an early and prominent social use of the web. The resulting online recipe collections are repositories of ingredient combinations and cooking methods whose large scale and variety yield interesting insights about both the fundamentals of cooking and user preferences. At the level of an individual ingredient we measure whether it tends to be essential or can be dropped or added, and whether its quantity can be modified. We also construct two types of networks to capture the relationships between ingredients. The complement network captures which ingredients tend to co-occur frequently, and is composed of two large communities: one savory, the other sweet. The substitute network, derived from user-generated suggestions for modifications, can be decomposed into many communities of functionally equivalent ingredients, and captures users’ preference for healthier variants of a recipe. Our experiments reveal that recipe ratings can be well predicted with features derived from combinations of ingredient networks and nutrition information.
Categories and Subject Descriptors
H.2.8 [Database Management]: Database applications - Data mining
General Terms
Measurement; Experimentation
Keywords
ingredient networks, recipe recommendation
1. INTRODUCTION
The web enables individuals to collaboratively share knowledge, and recipe websites are one of the earliest examples of collaborative knowledge sharing on the web. Allrecipes.com, the subject of our present study, was founded in 1997, years ahead of other collaborative websites such as Wikipedia. Recipe sites thrive because individuals are eager to share their recipes, from family recipes that had been passed down for generations, to new concoctions that they created that afternoon, having been motivated in part by the ability to share the result online. Once shared, the recipes are implemented and evaluated by other users, who supply ratings and comments.

The desire to look up recipes online may at first appear odd given that tomes of printed recipes can be found in almost every kitchen. The Joy of Cooking [12] alone contains 4,500 recipes spread over 1,000 pages. There is, however, substantial additional value in online recipes, beyond their accessibility. While the Joy of Cooking contains a single recipe for Swedish meatballs, Allrecipes.com hosts “Swedish Meatballs I”, “II”, and “III”, submitted by different users, along with 4 other variants, including “The Amazing Swedish Meatball”. Each variant has been reviewed, from 329 reviews for “Swedish Meatballs I” to 5 reviews for “Swedish Meatballs III”. The reviews not only provide a crowd-sourced ranking of the different recipes, but also many suggestions on how to modify them, e.g. using ground turkey instead of beef, skipping the “cream of wheat” because it is rarely on hand, etc.

The wealth of information captured by online collaborative recipe sharing sites is revealing not only of the fundamentals of cooking, but also of user preferences. The co-occurrence of ingredients in tens of thousands of recipes provides information about which ingredients go well together, and when a pairing is unusual. Users’ reviews provide clues as to the flexibility of a recipe, and the ingredients within it. Can the amount of cinnamon be doubled? Can the nutmeg be omitted? If one is lacking a certain ingredient, can a substitute be found among supplies at hand without a trip to the grocery store? Unlike cookbooks, which will contain vetted but perhaps not the best variants for some individuals’ tastes, ratings assigned to user-submitted recipes allow for the evaluation of what works and what does not.

In this paper, we seek to distill the collective knowledge and preference about cooking through mining a popular recipe-sharing website. To extract such information, we first parse the unstructured text of the recipes and the accompanying user reviews. We construct two types of networks that reflect different relationships between ingredients, in order to capture users’ knowledge about how to combine ingredients. The complement network captures which ingredients tend to co-occur frequently, and is composed of two large communities: one savory, the other sweet. The substitute network, derived from user-generated suggestions for modifications, can be decomposed into many communities of functionally equivalent ingredients, and captures users’ preference for healthier variants of a recipe. Our experiments reveal that recipe ratings can be well predicted by features derived from combinations of ingredient networks and nutrition information (with accuracy 0.792), while most of the prediction power comes from the ingredient networks (84%).

The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 describes the dataset. Section 4 discusses the extraction of the ingredient and complement networks and their characteristics. Section 5 presents the extraction of recipe modification information, as well as the construction and characteristics of the ingredient substitute network. Section 6 presents our experiments on recipe recommendation and Section 7 concludes.
2. RELATED WORK
Recipe recommendation has been the subject of much prior work. Typically the goal has been to suggest recipes to users based on their past recipe ratings [15][3] or browsing/cooking history [16]. The algorithms then find similar recipes based on overlapping ingredients, either treating each ingredient equally [4] or by identifying key ingredients [19]. Instead of modeling recipes using ingredients, Wang et al. [17] represent the recipes as graphs which are built on ingredients and cooking directions, and they demonstrate that graph representations can be used to easily aggregate Chinese dishes by the flow of cooking steps and the sequence of added ingredients. However, their approach only models the occurrence of ingredients or cooking methods, and doesn’t take into account the relationships between ingredients. In contrast, in this paper we incorporate the likelihood of ingredients to co-occur, as well as the potential of one ingredient to act as a substitute for another.

Another branch of research has focused on recommending recipes based on desired nutritional intake or promoting healthy food choices. Geleijnse et al. [7] designed a prototype of a personalized recipe advice system, which suggests recipes to users based on their past food selections and nutrition intake. In addition to nutrition information, Kamieth et al. [9] built a personalized recipe recommendation system based on availability of ingredients and personal nutritional needs. Shidochi et al. [14] proposed an algorithm to extract replaceable ingredients from recipes in order to satisfy users’ various demands, such as calorie constraints and food availability. Their method identifies substitutable ingredients by matching the cooking actions that correspond to ingredient names. However, their assumption that substitutable ingredients are subject to the same processing methods is less direct and specific than extracting substitutions directly from user-contributed suggestions.

Ahn et al. [1] and Kinouchi et al. [10] examined networks involving ingredients derived from recipes, with the former modeling ingredients by their flavor bonds, and the latter examining the relationship between ingredients and recipes. In contrast, we derive direct ingredient-ingredient networks of both complements and substitutes. We also step beyond characterizing these networks to demonstrating that they can be used to predict which recipes will be successful.
3. DATASET
Allrecipes.com is one of the most popular recipe-sharing websites, where novice and expert cooks alike can upload and rate cooking recipes. It hosts 16 customized international sites for users to share their recipes in their native languages, of which we study only the main, English, version. Recipes uploaded to the site contain specific instructions on how to prepare a dish: the list of ingredients, preparation steps, preparation and cook time, the number of servings produced, nutrition information, serving directions, and photos of the prepared dish. The uploaded recipes are enriched with user ratings and reviews, which comment on the quality of the recipe, and suggest changes and improvements. In addition to rating and commenting on recipes, users are able to save them as favorites or recommend them to others through a forum.

We downloaded 46,337 recipes from allrecipes.com with all of the information listed above, as well as several classifications, such as a region (e.g. the midwest region of US or Europe), the course or meal the dish is appropriate for (e.g. appetizers or breakfast), and any holidays the dish may be associated with. In order to understand users’ recipe preferences, we crawled 1,976,920 reviews which include reviewers’ ratings, review text, and the number of users who voted the review as useful.
The first step in processing the recipes is identifying the ingredients and cooking methods from the freeform text of the recipe. Usually, although not always, each ingredient is listed on a separate line. To extract the ingredients, we tried two approaches. In the first, we found the maximal match between a pre-curated list of ingredients and the text of the line. However, this missed too many ingredients, while misidentifying others. In the second approach, we used regular expression matching to remove non-ingredient terms from the line and identified the remainder as the ingredient. We removed quantifiers, e.g. “1 lb” or “2 cups”, words referring to consistency or temperature, e.g. chopped or cold, along with a few other heuristics, such as removing content in parentheses. For example “1 (28 ounce) can baked beans (such as Bush’s Original®)” is identified as “baked beans”. By limiting the list of potential terms to remove from an ingredient entry, we erred on the side of not conflating potentially identical or highly similar ingredients, e.g. “cheddar cheese”, used in 2450 recipes, was considered different from “sharp cheddar cheese”, occurring in 394 recipes.

We then generated an ingredient list sorted by frequency of ingredient occurrence and selected the top 1000 common ingredient names as our finalized ingredient list. Each of the top 1000 ingredients occurred in 23 or more recipes, with plain salt making an appearance in 47.3% of recipes. These ingredients also accounted for 94.9% of ingredient entries in the recipe dataset. The remaining ingredients were missed either because of high specificity (e.g. yolk-free egg noodle), referencing brand names (e.g. Planters almonds), rarity (e.g. serviceberry), misspellings, or not being a food (e.g. “nylon netting”).

The remaining processing task was to identify cooking processes from the directions. We first identified all heating methods using a listing in the Wikipedia entry on cooking [18]. For example, baking, boiling, and steaming are all ways of heating the food. We then identified mechanical ways of processing the food such as chopping and grinding, and other chemical techniques such as marinating and brining.

Figure 1: The percentage of recipes by region that apply a specific heating method. (x-axis: method: bake, boil, fry, grill, roast, simmer, marinate; y-axis: % in recipes; series: midwest, mountain, northeast, west coast, south.)
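The second extraction approach described above (strip non-ingredient terms with regular expressions, then keep the most frequent names) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the `UNITS` and `DESCRIPTORS` term lists and the function names `extract_ingredient` and `top_ingredients` are assumptions made for this example, and the lists are far shorter than anything a real pipeline would need.

```python
import re
from collections import Counter

# Hypothetical, deliberately short term lists; the paper does not publish its own.
UNITS = r"(?:lbs?|pounds?|ounces?|oz|cups?|teaspoons?|tablespoons?|cans?|packages?)"
DESCRIPTORS = r"(?:chopped|diced|minced|sliced|cold|warm|melted|softened)"


def extract_ingredient(line: str) -> str:
    """Strip quantities, units, and descriptors; return what is left."""
    text = line.lower()
    text = re.sub(r"\([^)]*\)", " ", text)           # drop content in parentheses
    text = re.sub(r"\d+(?:\s*/\s*\d+)?", " ", text)  # drop numbers and fractions
    text = re.sub(rf"\b{UNITS}\b", " ", text)        # drop measurement units
    text = re.sub(rf"\b{DESCRIPTORS}\b", " ", text)  # drop consistency/temperature words
    text = re.sub(r"[^a-z' ]", " ", text)            # drop leftover punctuation
    return " ".join(text.split())


def top_ingredients(recipes, k=1000):
    """recipes: iterable of lists of ingredient lines. Returns the k names
    that occur in the most recipes (each name counted once per recipe)."""
    counts = Counter()
    for lines in recipes:
        counts.update({extract_ingredient(line) for line in lines} - {""})
    return [name for name, _ in counts.most_common(k)]
```

Because only a limited set of terms is removed, "sharp cheddar cheese" and "cheddar cheese" stay distinct, which mirrors the conservative choice described in the text.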
Choosing one cooking method over another appears to be a question of regional taste. 5.8% of recipes were classified into one of five US regions: Mountain, Midwest, Northeast, South, and West Coast (including Alaska and Hawaii). Figure 1 shows significantly (χ² test p-value <
4. INGREDIENT COMPLEMENT NETWORK
Can we learn how to combine ingredients from the data? Here we employ the occurrences of ingredients across recipes to distill users’ knowledge about combining ingredients.

We constructed an ingredient complement network based on pointwise mutual information (PMI) defined on pairs of ingredients (a, b):

$$\mathrm{PMI}(a, b) = \log \frac{p(a, b)}{p(a)\,p(b)},$$

where

$$p(a, b) = \frac{\#\text{ of recipes containing } a \text{ and } b}{\#\text{ of recipes}}, \quad p(a) = \frac{\#\text{ of recipes containing } a}{\#\text{ of recipes}}, \quad p(b) = \frac{\#\text{ of recipes containing } b}{\#\text{ of recipes}}.$$

The PMI compares the probability that two ingredients occur together against the probability that would be expected if they occurred independently. Complementary ingredients tend to occur together far more often than would be expected by chance.

Figure 2 shows a visualization of ingredient complementarity. Two distinct subcommunities of recipes are immediately apparent: one corresponding to savory dishes, the other to sweet ones. Some central ingredients, e.g. egg and salt, actually are pushed to the periphery of the network. They are so ubiquitous that, although they have many edges, the edges are all weak, since they don’t show particular complementarity with any single group of ingredients.

We further probed the structure of the complementarity network by applying a network clustering algorithm [13]. The algorithm confirmed the existence of two main clusters containing the vast majority of the ingredients. An interesting satellite cluster is that of mixed drink ingredients, which is evident as a constellation of small nodes located near the top of the sweet cluster in Figure 2. The cluster includes the following ingredients: lime, rum, ice, orange, pineapple juice, vodka, cranberry juice, lemonade, tequila, etc.

For each recipe we recorded the minimum, average, and maximum pairwise pointwise mutual information between ingredients. The intuition is that complementary ingredients would yield higher ratings, while ingredients that don’t go together would lower the average rating. We found that while the average and minimum pointwise mutual information between ingredients is uncorrelated with ratings, the maximum is very slightly positively correlated with the average rating for the recipe (ρ = 0.09, p-value < − ). This suggests that having at least two complementary ingredients very slightly boosts a recipe’s prospects, but having clashing or unrelated ingredients does not seem to do harm.
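The PMI computation and the per-recipe minimum, average, and maximum described in this section can be sketched as below. This is an illustrative sketch under stated assumptions: recipes are given as sets of ingredient names, the natural logarithm is used, and the edge `threshold` default of 0.0 is a placeholder, since the text does not state the value used. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_table(recipes):
    """recipes: iterable of ingredient collections.
    Returns {(a, b): PMI} for every co-occurring pair, with a < b."""
    recipes = [set(r) for r in recipes]
    n = len(recipes)
    single, pair = Counter(), Counter()
    for r in recipes:
        single.update(r)
        pair.update(combinations(sorted(r), 2))
    return {
        (a, b): math.log((c / n) / ((single[a] / n) * (single[b] / n)))
        for (a, b), c in pair.items()
    }


def complement_edges(pmi, threshold=0.0):
    """Keep pairs whose PMI exceeds the (assumed) threshold."""
    return {p for p, value in pmi.items() if value > threshold}


def recipe_pmi_stats(recipe, pmi):
    """Minimum, mean, and maximum PMI over the recipe's ingredient pairs
    that appear in the table; None if no pair is known."""
    values = [pmi[p] for p in combinations(sorted(set(recipe)), 2) if p in pmi]
    if not values:
        return None
    return min(values), sum(values) / len(values), max(values)
```

Pairs that never co-occur are simply absent from the table, which avoids taking the logarithm of zero; a production pipeline would also want a minimum-count filter, since PMI is unstable for rare ingredients.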
5. RECIPE MODIFICATIONS
Co-occurrence of ingredients aggregated over individual recipes reveals the structure of cooking, but tells us little about how flexible the ingredient proportions are, or whether some ingredients could easily be left out or substituted. An experienced cook may know that apple sauce is a low-fat alternative to oil, or may know that nutmeg is often optional, but a novice cook may implement recipes literally, afraid that deviating from the instructions may produce poor results. While a traditional hardcopy cookbook would provide few such hints, they are plentiful in the reviews submitted by users who implemented the recipes, e.g. “This is a great recipe, but using fresh tomatoes only adds a few minutes to the prep time and makes it taste so much better”, or another comment about the same salsa recipe “This is by far the best recipe we have ever come across. We did however change it just a little bit by adding extra onion.”
As the examples illustrate, modifications are reported even when the user likes the recipe. In fact, we found that 60.1% of recipe reviews contain words signaling modification, such as “add”, “omit”, “instead”, “extra” and 14 others. Furthermore, it is the reviews that include changes that have a statistically higher average rating (4.49 vs. 4.39, t-test p-value < − ), and lower rating variance (0.82 vs. 1.05, Bartlett test p-value < − ), as is evident in the distribution of ratings, shown in Fig. 3. This suggests that flexibility in recipes is not necessarily a bad thing, and that reviewers who don’t mention modifications are more likely to think of the recipe as perfect, or to dislike it entirely.

Figure 2: Ingredient complement network. Two ingredients share an edge if they occur together more than would be expected by chance and if their pointwise mutual information exceeds a threshold.

Figure 3: The likelihood that a review suggests a modification to the recipe depends on the star rating the review is assigning to the recipe. (x-axis: rating; y-axis: proportion of reviews with given rating; series: no modification, with modification.)
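The comparison of ratings between reviews with and without modification words can be illustrated with a small sketch. It is only a descriptive summary under assumptions: `MOD_WORDS` contains just the four signal words named in the text (the other 14 are not listed there), the prefix match is crude (it would also flag a word such as "address"), population variance is used, and no t-test or Bartlett test is performed. The name `rating_summary` is hypothetical.

```python
import re
from statistics import mean, pvariance

# Only the four signal words named in the text; the remaining 14 are unknown here.
MOD_WORDS = ("add", "omit", "instead", "extra")
MOD_RE = re.compile(r"\b(?:" + "|".join(MOD_WORDS) + r")\w*\b")


def rating_summary(reviews):
    """reviews: iterable of (text, rating) pairs.
    Returns {mentions_modification: (mean rating, population variance)}."""
    groups = {True: [], False: []}
    for text, rating in reviews:
        groups[bool(MOD_RE.search(text.lower()))].append(rating)
    return {flag: (mean(r), pvariance(r)) for flag, r in groups.items() if r}
```

On real data, the two groups returned here would be the inputs to the significance tests reported in the text.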
In the following, we describe the recipe modifications extracted from user reviews, including adjustment, deletion and addition. We then present how we constructed an ingredient substitute network based on the extracted information.
Some modifications involve increasing or decreasing the amount of an ingredient in the recipe. In this and the following analyses, we split the review on punctuation such as commas and periods. We used simple heuristics to detect when a review suggested a modification: adding/using more/less of an ingredient counted as an increase/decrease. Doubling or increasing counted as an increase, while reducing, cutting, or decreasing counted as a decrease. While it is likely that there are other expressions signaling the adjustment of ingredient quantities, using this set of terms allowed us to compare the relative rate of modification, as well as the frequency of increase vs. decrease between ingredients. The ingredients themselves were extracted by performing a maximal character match within a window following an adjustment term.

Figure 4 shows the ratios of the number of reviews suggesting modifications, either increases or decreases, to the number of recipes that contain the ingredient. Two patterns are immediately apparent. Ingredients that may be perceived as being unhealthy, such as fats and sugars, are, with the exception of vegetable oil and margarine, more likely to be modified, and to be decreased. On the other hand, flavor enhancers such as soy sauce, lemon juice, cinnamon, Worcestershire sauce, and toppings such as cheeses, bacon and mushrooms, are also likely to be modified; however, they tend to be added in greater, rather than lesser quantities. Combined, the patterns suggest that good-tasting but “unhealthy” ingredients can be reduced, if desired, while spices, extracts, and toppings can be increased to taste.
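The adjustment heuristic described above (split on punctuation, look for an increase or decrease term, then take the longest ingredient name found in a window after it) might look roughly like this. The regular expressions, the 40-character `window`, and the function name `find_adjustments` are assumptions for illustration; the text does not give the exact patterns or window size.

```python
import re

# Assumed verb forms for "adding/using"; the paper's exact term list is not given.
_VERB = r"(?:add(?:ed|ing)?|us(?:e|ed|ing))"
INCREASE = re.compile(
    rf"\b(?:doubled?|doubling|increas(?:e|ed|ing)|{_VERB}\s+(?:\w+\s+){{0,2}}more)\b"
)
DECREASE = re.compile(
    rf"\b(?:reduc(?:e|ed|ing)|cut(?:ting)?|decreas(?:e|ed|ing)|{_VERB}\s+(?:\w+\s+){{0,2}}less)\b"
)


def find_adjustments(review, ingredients, window=40):
    """Return (direction, ingredient) pairs found in the review text."""
    results = []
    for clause in re.split(r"[,.;!?]", review.lower()):
        for label, pattern in (("increase", INCREASE), ("decrease", DECREASE)):
            match = pattern.search(clause)
            if not match:
                continue
            tail = clause[match.end(): match.end() + window]
            candidates = [name for name in ingredients if name in tail]
            if candidates:
                # Maximal character match: prefer the longest ingredient name.
                results.append((label, max(candidates, key=len)))
    return results
```

Preferring the longest match means "vanilla extract" wins over "vanilla" when both are in the ingredient list, which is one plausible reading of "maximal character match".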
Recipes are also frequently modified such that ingredients are omitted entirely. We looked for words indicating that the reviewer did not have an ingredient (and hence did not use it), e.g. “had no” and “didn’t have”. We further used “omit/left out/left off/bother with” as indication that the reviewer had omitted the ingredients, potentially for other reasons. Because reviewers often used simplified terms, e.g. “vanilla” instead of “vanilla extract”, we compared words in proximity to the action words by constructing 4-character-grams and calculating the cosine similarity between the n-grams in the review and the list of ingredients for the recipe.

To identify additions, we simply looked for the word “add”, but omitted possible substitutions. For example, we would use “added cucumber”, but not “added cucumber instead of green pepper”, the latter of which we analyze in the following section. We then compared the addition to the list of ingredients in the recipes, and considered the addition valid only if the ingredient does not already belong in the recipe.

Figure 4: Suggested modifications of quantity for the 50 most common ingredients, derived from recipe reviews. The line denotes equal numbers of suggested quantity increases and decreases. (Recoverable axis label: (reviews adjusting up) / (recipes); points are labeled with ingredient names.)
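The 4-character-gram matching used for deletions can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function names and the 0.3 acceptance threshold are assumptions.

```python
import math
from collections import Counter


def char_ngrams(text, n=4):
    """Count the overlapping character n-grams of a lowercased string."""
    text = text.lower().strip()
    if len(text) < n:
        return Counter([text]) if text else Counter()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors (Counters)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_ingredient(phrase, recipe_ingredients, min_similarity=0.3):
    """Return the recipe ingredient most similar to `phrase`, or None.

    `min_similarity` is a hypothetical cut-off; the paper gives no value.
    """
    phrase_grams = char_ngrams(phrase)
    best, best_score = None, 0.0
    for ingredient in recipe_ingredients:
        score = cosine(phrase_grams, char_ngrams(ingredient))
        if score > best_score:
            best, best_score = ingredient, score
    return best if best_score >= min_similarity else None
```

With this sketch, a reviewer's shorthand such as “vanilla” maps to “vanilla extract” because all four of its 4-grams also occur in the longer name, while an unrelated word shares none and is rejected.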
Table 1 shows the correlation between ingredient modifications. As might be expected, the more frequently an ingredient occurs in recipes, the more times its quantity has the opportunity to be modified, as is evident in the strong correlation between the number of recipes the ingredient occurs in and both increases and decreases recommended in reviews. However, the more common an ingredient, the more stable it appears to be. Recipe frequency is negatively correlated with deletions/recipe (ρ = −[...]) [...] (ρ = −[...]) [...] (ρ = −[...]) [...]

Table 1: Correlations between ingredient modifications. Column headings: addition, deletion, increase, decrease.
Replacement relationships show whether one ingredient is preferable to another. The preference could be based on taste, availability, or price. Some ingredient substitution tables can be found online (e.g., http://allrecipes.com/HowTo/common-ingredient-substitutions/detail.aspx), but they are not extensive, nor do they contain information about the relative frequencies of each substitution. Thus, we found an alternative source for extracting replacement relationships: users’ comments, e.g. “I replaced the butter in the frosting by sour cream, just to soothe my conscience about all the fatty calories”.

To extract such knowledge, we first parsed the reviews as follows: we considered several phrases to signal replacement relationships: “replace a with b”, “substitute b for a”, “b instead of a”, etc., and matched a and b to our list of ingredients.

We constructed an ingredient substitute network to capture users’ knowledge about ingredient replacement. This weighted, directed network consists of ingredients as nodes. We thresholded and eliminated any suggested substitutions that occurred fewer than 5 times. We then determined the weight of each edge by p(b | a), the proportion of substitutions of ingredient a that suggest ingredient b. For example, 68% of substitutions for white sugar were to splenda, an artificial sweetener, and hence the assigned weight for the sugar → splenda edge is 0.68.

Figure 5: Ingredient substitute network. Nodes are sized according to the number of times they have been recommended as a substitute for another ingredient, and colored according to their indegree.

The resulting substitution network, shown in Figure 5, exhibits strong clustering. We examined this structure by applying the map generator tool by Rosvall et al. [13], which uses a random walk approach to identify clusters in weighted, directed networks. The resulting clusters, and their relationships to one another, are shown in Fig. 6.
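The construction of the weighted, directed substitute network can be illustrated with a small sketch. It assumes the review parsing has already produced (a, b) pairs meaning “b was suggested in place of a”. Whether p(b | a) is normalized before or after thresholding is not stated in the text; this sketch normalizes over the edges that survive the threshold, and the function name is hypothetical.

```python
from collections import Counter, defaultdict


def build_substitute_network(substitutions, min_count=5):
    """Build {a: {b: p(b|a)}} from (a, b) substitution suggestions.

    Pairs seen fewer than `min_count` times are dropped, following the
    threshold of 5 in the text. Weights are normalized over the remaining
    suggestions for each original ingredient a (an assumption).
    """
    pair_counts = Counter(substitutions)
    kept = {pair: n for pair, n in pair_counts.items() if n >= min_count}

    totals = defaultdict(int)
    for (a, _b), n in kept.items():
        totals[a] += n

    network = defaultdict(dict)
    for (a, b), n in kept.items():
        network[a][b] = n / totals[a]
    return dict(network)
```

For instance, with toy counts of 17 suggestions of splenda, 8 of honey and 2 of maple syrup for white sugar, the maple syrup edge is discarded and the remaining weights are 0.68 and 0.32. These counts are invented for illustration and are not the paper's data.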
The derived clusters could be used when following a relatively new recipe which may not have received many reviews, and therefore not many suggestions for ingredient substitutions. If one does not have all ingredients at hand, one could examine the content of one’s fridge and pantry and match it with other ingredients found in the same cluster as the ingredient called for by the recipe. Table 2 lists the contents of a few such sample ingredient clusters, and Fig. 7 shows two example clusters extracted from the substitute network.

Table 2: Clusters of ingredients that can be substituted for one another. A maximum of 5 additional ingredients for each cluster are listed, ordered by PageRank.

main                other ingredients
chicken             turkey, beef, sausage, chicken breast, bacon
olive oil           butter, apple sauce, oil, banana, margarine
sweet potato        yam, potato, pumpkin, butternut squash, parsnip
baking powder       baking soda, cream of tartar
almond              pecan, walnut, cashew, peanut, sunflower s.
apple               peach, pineapple, pear, mango, pie filling
egg                 egg white, egg substitute, egg yolk
tilapia             cod, catfish, flounder, halibut, orange roughy
spinach             mushroom, broccoli, kale, carrot, zucchini
italian seasoning   basil, cilantro, oregano, parsley, dill
cabbage             coleslaw mix, sauerkraut, bok choy, napa cabbage

Finally, we examine whether the substitution network encodes preferences for one ingredient over another, as evidenced by the relative ratings of similar recipes, one which contains an original ingredient, and another which implements a substitution. To test this hypothesis, we construct a “preference network”, in which one ingredient is preferred to another in terms of received ratings. It is constructed by creating an edge (a, b) between a pair of ingredients, where a and b are listed in two recipes X and Y respectively, if the recipe ratings satisfy $R_X > R_Y$.
For example, if recipe X includes beef, ketchup and cheese, and recipe Y contains beef and pickles, then this recipe pair contributes to two edges: one from pickles to ketchup, and the other from pickles to cheese. The aggregate edge weights are defined based on PMI. Because PMI is a symmetric quantity (PMI(a; b) = PMI(b; a)), we introduce a directed PMI measure to cope with the directionality of the preference network:

$$\mathrm{PMI}(a \to b) = \log \frac{p(a \to b)}{p(a)\, p(b)},$$

where $p(a \to b)$ is the proportion of recipe pairs that contribute an edge from a to b, and p(a), p(b) are defined as in the previous section. We find high correlation between this preference network and the substitution network (ρ = 0.[...], p < [...]
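The directed PMI can be illustrated with a short sketch under explicit assumptions. Following the beef/ketchup/pickles example, edges are taken to run from each ingredient unique to the lower-rated recipe to each ingredient unique to the higher-rated one; p(a → b) is estimated as the fraction of recipe pairs contributing that edge; and p(a) is assumed to be the fraction of recipes containing a. The function name and these estimators are illustrative choices, not the paper's stated definitions.

```python
import math
from collections import Counter
from itertools import product


def directed_pmi(recipe_pairs, recipes):
    """Compute PMI(a -> b) = log( p(a -> b) / (p(a) p(b)) ).

    recipe_pairs: list of (higher_rated, lower_rated) ingredient sets.
    recipes: list of ingredient sets used to estimate p(a) as the
             fraction of recipes containing a (an assumption).
    """
    edge_counts = Counter()
    for better, worse in recipe_pairs:
        only_better = better - worse
        only_worse = worse - better
        # Assumed direction: lower-rated ingredient -> higher-rated ingredient.
        for src, dst in product(only_worse, only_better):
            edge_counts[(src, dst)] += 1

    n_pairs = len(recipe_pairs)
    n_recipes = len(recipes)
    ingredient_counts = Counter(i for recipe in recipes for i in recipe)

    pmi = {}
    for (a, b), n in edge_counts.items():
        p_a = ingredient_counts[a] / n_recipes
        p_b = ingredient_counts[b] / n_recipes
        if p_a == 0 or p_b == 0:
            continue  # ingredient never seen in `recipes`; skip the edge
        pmi[(a, b)] = math.log((n / n_pairs) / (p_a * p_b))
    return pmi
```

Unlike ordinary PMI, the result is asymmetric: an edge from pickles to ketchup can exist without an edge from ketchup to pickles.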
6. RECIPE RECOMMENDATION
We use the above insights to uncover novel recommendation algorithms suitable for recipe recommendations. We use ingredients and the relationships encoded between them in ingredient networks as our main feature sets to predict recipe ratings, and compare them against features encoding nutrition information, as well as other baseline features such as cooking methods, and preparation and cook time.
Figure 6: Ingredient substitution clusters. Nodes represent clusters and edges indicate the presence of recommended substitutions that span clusters. Each cluster represents a set of related ingredients which are frequently substituted for one another.

Figure 7: Relationships between ingredients located within two of the clusters from Fig. 6. (a) milk substitutes: milk, heavy whipping cream, whole milk, skim milk, whipping cream, heavy cream, buttermilk, soy milk, half and half, evaporated milk, cream. (b) cinnamon substitutes: cinnamon, ginger, pumpkin pie spice, cardamom, nutmeg, allspice, ginger root, clove, mace.
Then we apply a discriminative machine learning method, stochastic gradient boosting trees [6], to predict recipe ratings.

In the experiments, we seek to answer the following three questions. (1) Can we predict users’ preference for a new recipe given the information present in the recipe? (2) What are the key aspects that determine users’ preference? (3) Does the structure of ingredient networks help in recipe recommendation, and how?
The goal of our prediction task is: given a pair of similar recipes, determine which one has a higher average rating than the other. This task is designed particularly to help users who have a specific dish or meal in mind and are trying to decide between several recipe options for that dish.
Recipe pair data. The data for this prediction task consists of pairs of similar recipes. The reason for selecting similar recipes, with high ingredient overlap, is that while apples may be quite comparable to oranges in the context of recipes, especially if one is evaluating salads or desserts, lasagna may not be comparable to a mixed drink. To derive pairs of related recipes, we computed a cosine similarity between the ingredient lists for the two recipes, weighted by the inverse document frequency, log(# of recipes / # of recipes containing the ingredient). We considered only those pairs of recipes whose cosine similarity exceeded 0.2. The weighting is intended to identify higher similarity among recipes sharing more distinguishing ingredients, such as Brussels sprouts, as opposed to recipes sharing very common ones, such as butter.

A further challenge to obtaining reliable relative rankings of recipes is variance introduced by having different users choose to rate different recipes. In addition, some users might not have a sufficient number of reviews under their belt to have calibrated their own rating scheme. To control for variation introduced by users, we examined recipe pairs where the same users are rating both recipes and are collectively expressing a preference for one recipe over another. Specifically, we generated 62,031 recipe pairs (a, b) where rating_i(a) > rating_i(b), for at least 10 users i, and over 50% of users who rated both recipe a and recipe b. Furthermore, each user i should be an active enough reviewer to have rated at least 8 other recipes.

Features.
In the prediction dataset, each observation consists of a set of predictor variables or features that represent information about two recipes, and the response variable is a binary indicator of which gets the higher rating on average. To study the key aspects of recipe information, we constructed different sets of features, including:

• Baseline: This includes cooking methods, such as chopping, marinating, or grilling, and cooking effort descriptors, such as preparation time in minutes, as well as the number of servings produced, etc. These features are considered as primary information about a recipe and will be included in all other feature sets described below.
• Full ingredients: We selected up to 1000 popular ingredients to build a “full ingredient list”. In this feature set, each observed recipe pair contains a vector with entries indicating whether an ingredient from the full list is present in either recipe in the pair.
• Nutrition: This feature set does not include any ingredients but only nutrition information such as the total caloric content, as well as quantities of fats, carbohydrates, etc.
• Ingredient networks: In this set, we replaced the full ingredient list by structural information extracted from different ingredient networks, as described in Sections 4 and 5.3. Co-occurrence is treated in two ways: as a raw count, and as complementarity, captured by the PMI.
• Combined set: Finally, a combined feature set is constructed to test the performance of a combination of features, including baseline, nutrition and ingredient networks.

To build the ingredient network feature set, we extracted the following two types of structural information from the co-occurrence and substitution networks, as well as the complement network derived from the co-occurrence information:
Network positions are calculated to represent how a recipe’s ingredients occupy positions within the networks. Such position measures are likely to inform if a recipe contains any “popular” or “unusual” ingredients. To calculate the position measures, we first calculated various network centrality measures, including degree centrality, betweenness centrality, etc., from the ingredient networks. A centrality measure can be represented as a vector $\vec{g}$ where each entry indicates the centrality of an ingredient. The network position of a recipe, with its full ingredient list represented as a binary vector $\vec{f}$, can be summarized by $\vec{g}^{T} \cdot \vec{f}$, i.e., an aggregated centrality measure based on the centrality of its ingredients.

Network communities provide information about which ingredient is more likely to co-occur with a group of other ingredients in the network. A recipe consisting of ingredients that are frequently used with, complemented by or substituted by certain groups may be predictive of the ratings the recipe will receive. To obtain the network community information, we applied latent semantic analysis (LSA) on recipes. We first factorized each ingredient network, represented by matrix $W$, using singular value decomposition (SVD). In the matrix $W$, each entry $W_{ij}$ indicates whether ingredient i co-occurs with, complements or substitutes ingredient j.

Suppose $W_k = U_k \Sigma_k V_k^{T}$ is a rank-k approximation of $W$; we can then transform each recipe’s full ingredient list using the low-dimensional representation, $\Sigma_k^{-1} V_k^{T} \vec{f}$, as community information within a network. These low-dimensional vectors, together with the vectors of network positions, constitute the ingredient network features.

Figure 8: Prediction performance. The nutrition information and ingredient networks are more effective features than full ingredients. The ingredient network features lead to impressive performance, close to the best performance. (Bars: baseline, full ingredients, nutrition, ing. networks, combined; axis: accuracy.)

Learning method.
We applied discriminative machine learning methods such as support vector machines (SVM) [2] and stochastic gradient boosting trees [5] to our prediction problem. Here we report and discuss the detailed results based on the gradient boosting tree model. Like SVM, the gradient boosting tree model seeks a parameterized classifier, but unlike SVM, which considers all the features at one time, the boosting tree model considers a set of features at a time and iteratively combines them according to their empirical errors. In practice, it not only has competitive performance comparable to SVM, but can serve as a feature ranking procedure [11].

Figure 9: Relative importance of features in the combined set. The individual items from nutrition information are very indicative in differentiating highly rated recipes, while most of the prediction power comes from ingredient networks. (Feature groups: nutrition (6.5%), cook effort (5.0%), ing. networks (84%), cook methods (3.9%).)

Figure 10: Relative importance of features representing the network structure. The substitution network has the strongest contribution (39.8%) to the total importance of network features, and it also has more influential features in the top 100 list, which suggests that the substitution network is complementary to other features. (Networks: substitution (39.8%), co-occurrence (30.9%), complement (29.2%).)

In this work, we fitted a stochastic gradient boosting tree model with 8 terminal nodes under an exponential loss function. The dataset is roughly balanced in terms of which recipe is the higher-rated one within a pair. We randomly divided the dataset into a training set (2/3) and a testing set (1/3). The prediction performance is evaluated based on accuracy, and the feature performance is evaluated in terms of relative importance [8].
For each single decision tree, one of the input variables, $x_j$, is used to partition the region associated with that node into two subregions in order to fit to the response values. The squared relative importance of variable $x_j$ is the sum of such squared improvements over all internal nodes for which it was chosen as the splitting variable, as:

$$\mathrm{imp}(j) = \sum_{k} \hat{i}_k^{2} \, I(\text{splits on } x_j)$$

where $\hat{i}_k^{2}$ is the empirical improvement by the k-th node splitting on $x_j$ at that point.

The overall prediction performance is shown in Fig. 8. Surprisingly, even with a full list of ingredients, the prediction accuracy is only improved from .712 (baseline) to .746. In contrast, the nutrition information and ingredient networks are more effective (with accuracy .753 and .786, respectively). Both of them have much lower dimensions (from tens to several hundreds), compared with the full ingredients that are represented by more than 2000 dimensions (1000 ingredients per recipe in the pair). The ingredient network features lead to impressive performance, close to the best performance given by the combined set (.792), indicating the power of network structures in recipe recommendation.

Figure 11: Relative importance of features from nutrition information. The carbs item is the most influential feature in predicting higher-rated recipes. (Features: carbs (20.9%), cholesterol (17.7%), calories (19.7%), sodium (16.8%), fiber (12.3%), fat (12.4%).)

Figure 9 shows the influence of different features in the combined feature set. Up to 100 features with the highest relative importance are shown. The importance of a feature group is summarized by how much of the total importance is contributed by all features in the set. For example, the baseline, consisting of cooking effort and cooking methods, contributes 8.9% to the overall performance. The individual items from nutrition information are very indicative in differentiating highly-rated recipes, while most of the prediction power comes from ingredient networks (84%).

Figure 10 shows the top 100 features from the three networks. In terms of the total importance of ingredient network features, the substitution network has slightly stronger contribution (39.8%) [...]

Figure 12: Prediction performance over reduced dimensionality. The best performance is given by reduced dimension k = 50 when combining all three networks. In addition, using the information about the complement network alone is more effective in prediction than using the other two networks. (x-axis: dimensions, 10 to 70; y-axis: accuracy; series: combined, substitution, complement, co-occurrence.)

Figure 13: Influential substitution communities. The matrix shows the most influential feature dimensions extracted from the substitution network. For each dimension, the six representative ingredients with the highest intensity values are shown, with colors indicating their intensity. These features suggest that the communities of ingredient substitutes, such as the sweet and oil substitutes in the first dimension, are particularly informative in prediction. (Axes: ingredient, SVD dimension.)

[...] data. Hence we chose k = 50 for the reduced dimension of all three networks.
Figure 12 also shows that using the information about the complement network alone is more effective in prediction than using either the co-occurrence or the substitute network, even in the case of low dimensions. Consistently, as shown in terms of relative importance (Fig. 10), the substitution network alone is not the most effective, but it provides more complementary information in the combined feature set.

In Figure 13 we show the most representative ingredients in the decomposed matrix derived from the substitution network. We display the top five influential dimensions, evaluated based on the relative importance, from the SVD resultant matrix $V_k$, and in each of these dimensions we extracted six representative ingredients based on their intensities in the dimension (the squared entry values). These representative ingredients suggest that the communities of ingredient substitutes, such as the sweet and oil substitutes in the first dimension or the milk substitutes in the second dimension (which is similar to the cluster shown in Fig. 6), are particularly informative in predicting recipe ratings.

To summarize our observations, we find we are able to effectively predict users’ preference for a recipe, but the prediction is not through using a full list of ingredients. Instead, by using the structural information extracted from the relationships among ingredients, we can better uncover users’ preference about recipes.
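The selection of representative ingredients for an SVD dimension, ranked by squared entry value as described for Figure 13, can be sketched as follows. The matrix layout (one row per ingredient, one column per dimension) and the function name are assumptions, and the numbers in the usage note are toy values rather than the paper's data.

```python
def representative_ingredients(v_k, ingredient_names, dimension, top_n=6):
    """Return the `top_n` ingredients with the largest squared loading.

    v_k: list of rows, one per ingredient, each a list of k loadings
         (assumed layout of the matrix V_k).
    dimension: zero-based index of the SVD dimension to inspect.
    """
    scored = [(row[dimension] ** 2, name)
              for row, name in zip(v_k, ingredient_names)]
    # Sort by intensity only, so ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _intensity, name in scored[:top_n]]
```

Squaring the entries means that strongly negative loadings count as much as strongly positive ones, so an ingredient can be representative of a dimension regardless of sign.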
7. CONCLUSION
Recipes are little more than instructions for combining and processing sets of ingredients. Individual cookbooks, even the most expansive ones, contain single recipes for each dish. The web, however, permits collaborative recipe generation and modification, with tens of thousands of recipes contributed to individual websites. We have shown how this data can be used to glean insights about regional preferences and modifiability of individual ingredients, and also how it can be used to construct two kinds of networks, one of ingredient complements, the other of ingredient substitutes. These networks encode which ingredients go well together, and which can be substituted to obtain superior results, and permit one to predict, given a pair of related recipes, which one will be more highly rated by users.

In future work, we plan to extend ingredient networks to incorporate the cooking methods as well. It would also be of interest to generate region-specific and diet-specific ratings, depending on the users’ background and preferences. A whole host of user-interface features could be added for users who are interacting with recipes, whether the recipe is newly submitted, and hence unrated, or whether they are browsing a cookbook. In addition to automatically predicting a rating for the recipe, one could flag ingredients that can be omitted, ones whose quantity could be tweaked, as well as suggested additions and substitutions.
8. ACKNOWLEDGMENTS
This work was supported by MURI award FA9550-08-1-0265 from the Air Force Office of Scientific Research. The methodology used in this paper was developed with support from funding from the Army Research Office, Multi-University Research Initiative on Measuring, Understanding, and Responding to Covert Social Networks: Passive and Active Tomography. The authors gratefully acknowledge D. Lazer for support.
9. REFERENCES
[1] Ahn, Y., Ahnert, S., Bagrow, J., and Barabasi, A. Flavor network and the principles of food pairing. Bulletin of the American Physical Society 56 (2011).
[2] Cortes, C., and Vapnik, V. Support-vector networks. Machine learning 20, 3 (1995), 273–297.
[3] Forbes, P., and Zhu, M. Content-boosted matrix factorization for recommender systems: Experiments with recipe recommendation. Proceedings of Recommender Systems (2011).
[4] Freyne, J., and Berkovsky, S. Intelligent food planning: personalized recipe recommendation. In IUI, ACM (2010), 321–324.
[5] Friedman, J. Stochastic gradient boosting. Computational Statistics & Data Analysis 38, 4 (2002), 367–378.
[6] Friedman, J., Hastie, T., and Tibshirani, R. Additive logistic regression: a statistical view of boosting. Annals of Statistics 28 (1998), 2000.
[7] Geleijnse, G., Nachtigall, P., van Kaam, P., and Wijgergangs, L. A personalized recipe advice system to promote healthful choices. In IUI, ACM (2011), 437–438.
[8] Hastie, T., Tibshirani, R., Friedman, J., and Franklin, J. The elements of statistical learning: data mining, inference and prediction. The Mathematical Intelligencer 27, 2 (2005).
[9] Kamieth, F., Braun, A., and Schlehuber, C. Adaptive implicit interaction for healthy nutrition and food intake supervision. Human-Computer Interaction. Towards Mobile and Intelligent Interaction Environments (2011), 205–212.
[10] Kinouchi, O., Diez-Garcia, R., Holanda, A., Zambianchi, P., and Roque, A. The non-equilibrium nature of culinary evolution. New Journal of Physics 10 (2008), 073020.
[11] Lu, Y., Peng, F., Li, X., and Ahmed, N. Coupling feature selection and machine learning methods for navigational query identification. In CIKM, ACM (2006), 682–689.
[12] Rombauer, I., Becker, M., Becker, E., and Maestro, L. Joy of cooking. Scribner Book Company, 1997.
[13] Rosvall, M., and Bergstrom, C. Maps of random walks on complex networks reveal community structure. PNAS 105, 4 (2008), 1118.
[14] Shidochi, Y., Takahashi, T., Ide, I., and Murase, H. Finding replaceable materials in cooking recipe texts considering characteristic cooking actions. In Proc. of the ACM multimedia 2009 workshop on Multimedia for cooking and eating activities, ACM (2009), 9–14.
[15] Svensson, M., Höök, K., and Cöster, R. Designing and evaluating kalas: A social navigation system for food recipes. ACM Transactions on Computer-Human Interaction (TOCHI) 12, 3 (2005), 374–400.
[16] Ueda, M., Takahata, M., and Nakajima, S. User’s food preference extraction for personalized cooking recipe recommendation. Proc. of the Second Workshop on Semantic Personalized Information Management: Retrieval and Recommendation (2011).
[17] Wang, L., Li, Q., Li, N., Dong, G., and Yang, Y. Substructure similarity measurement in chinese recipes. In WWW, ACM (2008), 979–988.
[18] Wikipedia. Outline of food preparation, 2011. [Online; accessed 22-Oct-2011].
[19] Zhang, Q., Hu, R., Mac Namee, B., and Delany, S. Back to the future: Knowledge light case base cookery. In