Kelp Talk

How precise should the kelp bed outlining be?

  • robosapien by robosapien

    Is data from the selected area actually used, or should we just try to include as much of the kelp bed as possible? Do other people analyze the images later to measure the size of the kelp beds?

    Posted

  • Quia by Quia

    The field guide gives two nice examples of how to mark the kelp.

    Your classification of each image is combined with several other people's. Combining the estimates of multiple people means that in the final analysis it's possible to tell where there is 'definitely kelp', 'probably kelp', and 'maybe some faint kelp.'

    Posted

  • robosapien by robosapien

    Thanks. My question relates more to how accurate we should be in delineating the exact edges of the kelp bed. Does that matter at all or is it just to delineate the general area for the investigators to look at?

    Posted

  • Artman40 by Artman40

    Does the estimation take into account how many times a user has classified pictures beforehand?

    Posted

  • Quia by Quia

    Paging @jebyrnes for some more definitive answers from the science team. 😃

    I can make some informed guesses, though.

    @robosapien Given that the input method is free drawing, we're clearly not expected to be pixel perfect. I err on the side of selecting all the kelp rather than avoiding all open water, but my method is no more 'correct' than trying to mark the outer borders of the kelp exactly and missing a few bits.

    @artman40 The answer to this one is 'if it improves the quality of the dataset!' It really depends on the project, and I don't know whether this analysis has been done on any of the finished data from Floating Forests yet, so this is all general citizen-science/Zooniverse experience. People can get better at classifying as they go, and using that information means better data out of the project. The final analysis can use a 'weighted' dataset that gives more importance to some users based on things like number of classifications, agreement with a small subset of expert-classified subjects, and so on. Almost every paper coming out of a Zooniverse project has a section on user weighting. A toy example of the idea is sketched below.
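
    To make the weighting idea concrete, here is a minimal sketch. All of the numbers and the weighting formula are invented for illustration; real projects fit weights from the data rather than hand-picking a formula like this.

    ```r
    # Hypothetical per-user weights built from experience and agreement
    # with expert-classified ('gold standard') subjects.
    votes   <- c(1, 1, 0, 1)              # did each of 4 users mark this pixel?
    n_class <- c(500, 40, 5, 1200)        # classifications each user has done
    gold    <- c(0.90, 0.70, 0.50, 0.95)  # agreement with expert subjects

    weights <- gold * log10(1 + n_class)  # one of many possible schemes
    score   <- sum(weights * votes) / sum(weights)
    score > 0.5                           # weighted consensus: call it kelp?
    ```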

    Posted

  • jebyrnes by jebyrnes scientist

    Hello! We want y'all to be as precise as you can be, but don't kill yourselves. We're going with consensus identification of kelp canopies - i.e., of the 8 people who see an image, for each pixel, we'll count it as kelp if 6 or more included it in their circle. So, delineating the outside edges is ideal. The Zooniverse team will be giving us back both heatmaps and raw data with coordinate information that we can apply to each area. Once we have that information, our next (planned) step is to feed it back into the software we've been using to estimate biomass, which looks at the colors of individual pixels. So, extreme precision wouldn't be a problem there, as we're getting colors anyway.
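
    A minimal sketch of that consensus rule, with toy data (the 8 masks here are random placeholders; the real masks would come from volunteers' circles rasterized onto the image grid):

    ```r
    # Stack the 8 users' binary masks for one image and keep the pixels
    # that at least 6 of them marked as kelp.
    n_users   <- 8
    threshold <- 6

    set.seed(1)  # toy 5x5 masks standing in for rasterized user circles
    masks <- array(rbinom(n_users * 5 * 5, 1, 0.5), dim = c(n_users, 5, 5))

    counts    <- apply(masks, c(2, 3), sum)  # users marking each pixel
    consensus <- counts >= threshold         # TRUE where consensus says kelp
    ```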

    With respect to canopy cover: again, high precision is extremely helpful, but because we're going with consensus choices, a lot of the problems will even out. Given that we're adding up over much, much larger geographic areas, some of the lack of precision will come out as measurement error. Fortunately, we have a high-quality canopy map for Santa Barbara and a few other parts of California, so we'll be able to calibrate that measurement error and build it into the analyses.

    Last, for looking at range limits, accuracy shouldn't be a problem. Presence/absence is more the issue.

    So, in sum, be as accurate as you can! Erring on the side of selecting all of the kelp is likely to produce better long-term results! But fear not! We'll be using consensus maps to help improve accuracy.

    Posted

  • Quia by Quia

    Thanks for sharing all the gritty details, much appreciated! Is the 6-out-of-8 threshold based on analysis of initial data? A 3/4 majority seems like it would produce an extremely clean sample that might leave out a lot of the fainter kelp beds... though I wonder, not having seen any of the data you've based it on. Maybe we're just better classifiers than I'm giving us credit for!

    Posted

  • jebyrnes by jebyrnes scientist

    The 6/8 is a first stab. Fortunately, as I mentioned, we have a good chunk of CA done by expert classifiers (see Kyle Cavanaugh's paper at http://www.int-res.com/abstracts/meps/v429/p1-17/ ), so we're going to run the data against that first to establish the minimum threshold. 6/8 might be too high, you're right. But from working with students so far, people seem to be surprisingly good at this. More soon, once we start cracking on the data.
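
    For the curious, the calibration could look something like this sketch: score each candidate threshold against an expert ('gold') map and keep the best. The data here are random placeholders, and simple pixel accuracy is just one of several possible agreement scores.

    ```r
    # Toy calibration: per-pixel vote counts vs. an expert classification.
    set.seed(2)
    counts <- matrix(sample(0:8, 25, replace = TRUE), 5, 5)  # volunteer votes
    gold   <- matrix(rbinom(25, 1, 0.4) == 1, 5, 5)          # expert kelp map

    # Agreement of each candidate threshold with the expert map
    accuracy <- sapply(1:8, function(t) mean((counts >= t) == gold))
    which.max(accuracy)  # candidate minimum threshold
    ```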

    Posted

  • yshish by yshish

    Please, what about a kelp area I mark that overlaps another kelp area? Does it count the overlapped parts once or twice? With large kelp forests I'm not able to mark the whole thing at once, since I only use a small touchpad now, and it would be much easier for me if I could overlap the marks and not worry about it :]

    Hope that's understandable.

    Thanks. Zuzi

    Posted

  • Artman40 by Artman40

    What about tiny patches of kelp?

    Posted

  • jebyrnes by jebyrnes scientist

    If you see 'em, circle 'em! Remember, for some parts of California, we also have expert classifications, so we'll be comparing your classifications in CA to them. This will at least tell us how much agreement we need to get to match up with our expert classifiers. We may also use that information in other ways in the future - haven't reached that stage of analysis yet.

    Posted

  • yshish by yshish in response to jebyrnes's comment.

    And what about the overlapping I asked about in the previous post? 😃

    Thanks!

    Zuzi

    Posted

  • DZM by DZM admin in response to jebyrnes's comment.

    That's a good thing to know. It serves the same purpose that simulations do in other projects: having expert-classified "gold-standard" data to measure the accuracy of volunteers' work.

    Thanks for letting us know!

    Posted

  • yshish by yshish in response to yshish's comment.

    Please, could someone explain to me whether the overlapping areas are counted twice or just once? I'm hesitant to classify since I don't want to make mistakes 😦

    Thank you:)

    Posted

  • DZM by DZM admin in response to yshish's comment.

    A good question. (Just for reference, here's the original question):

    Please, what about a kelp area I mark that overlaps another kelp area? Does it count the overlapped parts once or twice? With large kelp forests I'm not able to mark the whole thing at once, since I only use a small touchpad now, and it would be much easier for me if I could overlap the marks and not worry about it :]

    I imagine that overlaps are probably not a problem, but let's get a formal answer from @jebyrnes ... or, as a backup, I could see what one of the devs who built the project might have to say. Let's see if we can get @jebyrnes or another scientist's opinion first!

    Posted

  • jebyrnes by jebyrnes scientist

    We were just on a call to talk about data parsing, so I'm so glad you asked! Overlaps are not a problem. Our current plan is to use the sp and rgdal libraries in R to make polygon shapefiles for each user's classifications of an image. We'll then calculate the area from that. We're also going to take all of the shapefiles for a given image, and then overlay them to generate heatmaps. Once we have the first one together, I'll blog it.
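
    To illustrate why overlap washes out: once a user's marks are dissolved into a single geometry, the overlapping region is only counted once. A small sketch below uses sp plus rgeos for the union step (the team named sp and rgdal above; rgeos is my assumption here, as a common companion for geometry operations).

    ```r
    library(sp)
    library(rgeos)  # assumption: for gUnaryUnion/gArea; not named by the team

    # Helper: an axis-aligned square as a SpatialPolygons object
    square <- function(x, y, s, id) {
      ring <- Polygon(cbind(c(x, x + s, x + s, x, x),
                            c(y, y, y + s, y + s, y)))
      SpatialPolygons(list(Polygons(list(ring), ID = id)))
    }

    p1 <- square(0, 0, 4, "mark_1")  # two overlapping marks from one user
    p2 <- square(2, 2, 4, "mark_2")

    merged <- gUnaryUnion(rbind(p1, p2))  # dissolve the marks into one shape
    gArea(merged)  # 28, not 32: the overlapping 2x2 region counts once
    ```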

    So - no worries at all on overlap! It should all come out in the wash!

    Posted

  • yshish by yshish in response to jebyrnes's comment.

    Oh, great, that's perfect! It makes classification much easier when I don't have to worry about overlapping 😃

    Thank you for your reply!

    Zuzi

    Posted