Showing posts with label atlas.

2019-01-18

Dr Lukas Heinrich

It was an honor and a privilege to serve on the PhD defense committee of Lukas Heinrich (NYU), who has had a huge impact on how particle physicists do data analysis. For one, he has designed and built a system that permits re-use of intermediate data results from the ATLAS experiment in new data analyses, measurements, and searches for new physics. For another, he has figured out how to preserve data analyses and workflows in a reproducible framework using containers. For yet another, he has been central in convincing the ATLAS experiment and CERN more generally to adopt standards for the registration and preservation of data analysis components. And if that's not all, he has structured this so that data analyses can be expressed as modular graphs and modified and re-executed.

I'm not worthy! But in addition to all this, Heinrich is a great example of my oft-repeated idea that principled data analysis lies at the intersection of theory and hardware: His work on ruling out supersymmetric models using ATLAS data requires a mixture of theoretical and engineering skills and knowledge, and he has nailed both.

The day was a pleasure, and that isn't just the champagne talking. Congratulations Dr. Heinrich!

2018-12-27

long-form writing

I'm spending some time over the break thinking about possible long-form writing projects. I have an Atlas of Galaxies to finish, and I have ideas about possible books on introductory mechanics, and ideas about something on the practice and deep beliefs of scientists. And statistics and data analysis, of course! I kicked those around and wrote a little in a possible mechanics preface.

2016-08-17

probabilistic redshifts

In the morning I had a long and overdue conversation with Alex Malz, who is attempting to determine galaxy one-point statistics given probabilistic photometric-redshift information. That is, each galaxy (as in, say, the LSST plan and some SDSS outputs) is given a posterior probability over redshifts rather than a single redshift determination. How are these responsibly used? It turns out that the answer is not trivial: They have to be incorporated into a hierarchical inference, in which the (often implicit) interim priors used to make the p(z) outputs are replaced by a model for the distribution of galaxies. That requires (a) the mathematics of probability, and (b) knowing the interim priors. One big piece of advice or warning we have for current and future surveys: Don't produce probabilistic redshifts unless you can produce the exact priors too! Some photometric-redshift schemes don't even really know what their priors are, and this is death.
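The replace-the-interim-prior idea can be sketched numerically. Below is a minimal expectation-maximization sketch of my own (the function name and the piecewise-constant model for the redshift distribution are illustrative assumptions, not Malz's actual method), assuming each galaxy comes with posterior samples in redshift and that the interim prior density is known exactly:

```python
import numpy as np

def infer_nz(z_samples, interim_prior, bins, n_iter=200):
    """Estimate a piecewise-constant redshift distribution f(z) from
    per-galaxy posterior samples drawn under a known interim prior,
    by expectation-maximization with importance reweighting."""
    n_bins = len(bins) - 1
    widths = np.diff(bins)
    f = np.full(n_bins, 1.0 / n_bins)  # flat starting guess (bin probabilities)
    # per galaxy: bin index of each sample, and 1/interim-prior weights
    idx = [np.clip(np.digitize(z, bins) - 1, 0, n_bins - 1) for z in z_samples]
    w0 = [1.0 / interim_prior(z) for z in z_samples]
    for _ in range(n_iter):
        counts = np.zeros(n_bins)
        for i, w in zip(idx, w0):
            wt = w * f[i] / widths[i]  # swap interim prior for current model
            np.add.at(counts, i, wt / wt.sum())
        f = counts / counts.sum()
    return f
```

The division by `interim_prior(z)` is the whole point: without knowing the interim prior, the reweighting is impossible, which is exactly the warning above.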

In the afternoon, I discussed various projects with John Moustakas (Siena), around Gaia and large galaxies. He mentioned that he is creating a diameter limited catalog and atlas of galaxies. I am very interested in this, but we had to part ways before discussing further.

2014-06-17

bits of writing

I wrote words for my Atlas and also for Adam Bolton (Utah). The latter was a probabilistic generalization of the k-means clustering algorithm that might be able to deliver high-quality quasar archetypes for automated redshift fitting in SDSS-IV.
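One standard probabilistic generalization of k-means is expectation-maximization for a mixture of spherical Gaussians with a shared, fixed width; ordinary k-means is the zero-width limit. This is a sketch of that general idea (my own illustration, not the algorithm we wrote up for Bolton):

```python
import numpy as np

def soft_kmeans(X, k, sigma=1.0, n_iter=50, seed=0):
    """Soft (probabilistic) k-means: EM for a mixture of spherical
    Gaussians with fixed, shared width sigma; ordinary k-means is
    the sigma -> 0 limit, where responsibilities become hard."""
    rng = np.random.default_rng(seed)
    mu = X[rng.choice(len(X), size=k, replace=False)]  # init centers from data
    for _ in range(n_iter):
        # E-step: responsibilities from squared distances (log-space for safety)
        d2 = ((X[:, None, :] - mu[None, :, :]) ** 2).sum(axis=-1)
        logr = -0.5 * d2 / sigma ** 2
        logr -= logr.max(axis=1, keepdims=True)
        r = np.exp(logr)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted means become the new centers
        mu = (r[:, :, None] * X[:, None, :]).sum(axis=0) / r.sum(axis=0)[:, None]
    return mu, r
```

For archetype-finding, the appeal of a probabilistic version is that every object gets a responsibility over archetypes rather than a hard assignment.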

2014-06-05

rate or rate density?

Foreman-Mackey and I pair-wrote some more in his exoplanet populations paper. We made strict (and explicit) definitions of "rate" and "rate density" and audited the document to be consistent with those definitions. A rate is a dimensionless expectation value for an integer draw from a Poisson distribution. A rate density is something that needs to be integrated over a finite volume in some parameter space to produce a rate. We reminded ourselves that the model is an "inhomogeneous Poisson process" (inhomogeneous because the rate density varies with planet period and radius) and said so where appropriate. We massaged the text around the issues of converting rate estimates from other projects into rate densities to compare with our results. And we finished the figure captions. So close. I also wrote a bit in my own Atlas.

[Added after the fact: Above I am talking about the "rate" of a process inside a discrete population: This is about the rate at which stars host planets. There is another use of "rate" in physics that is number per time; it has to be integrated over a time interval to get a dimensionless number. The words "rate" and "frequency" both have these double meanings of either dimensionless object (in discrete probability contexts) or else number per time (in time-domain physics contexts).]
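To make the rate vs rate-density distinction concrete, here is a toy numerical illustration of my own (the constant density and the unit rectangle are made-up numbers, not results from the paper): a rate density must be integrated over a finite region of (period, radius) space before it becomes a dimensionless Poisson rate.

```python
import numpy as np

def expected_rate(rate_density, period_edges, radius_edges):
    """Integrate a rate density over a finite rectangle in (period, radius)
    space to get a dimensionless rate: the Poisson expectation value for
    the number of planets per star in that rectangle (midpoint rule)."""
    pc = 0.5 * (period_edges[1:] + period_edges[:-1])  # cell centers
    rc = 0.5 * (radius_edges[1:] + radius_edges[:-1])
    P, R = np.meshgrid(pc, rc, indexing="ij")
    dP = np.diff(period_edges)[:, None]
    dR = np.diff(radius_edges)[None, :]
    return float((rate_density(P, R) * dP * dR).sum())

# a toy constant rate density of 0.5 planets per star per unit of area
rate = expected_rate(lambda P, R: 0.5 * np.ones_like(P),
                     np.linspace(0.0, 1.0, 101), np.linspace(0.0, 1.0, 101))
```

If the rate density varies with period and radius, as in the inhomogeneous Poisson process above, only the integrated rate over a stated region is comparable across projects; that is exactly the conversion issue we were massaging in the text.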

2014-05-29

calibration problems as correlated noise; shear mapping

In addition to writing a small amount in my Atlas today, I read a few papers and parts of papers. One was an appendix by Michael Cushing (Toledo) about fitting sets of data that include both photometry and spectroscopy with calibration issues. Cushing treats the calibration issues as a form of exceedingly correlated noise. This is very related to things we worked on at #UnDisLo last month, and he has some very good ideas about how to structure the problem. Some of the ideas are related to things Cushing and I talked about when I was visiting Toledo last year.

Another read was a paper proposal by Michael Schneider (LLNL) about doing weak lensing by fitting a Gaussian Process model of the three-dimensional shear map to the data, and then doing cosmological analysis on the samples or posterior for that shear map. This latter idea grew out of discussions in the MBI team that entered the weak-lensing GREAT3 Challenge. The MBI team is trying to think as probabilistically as our computational resources allow.

2014-05-27

writing

I wrote text in my Atlas technical description. That is all.

2014-03-18

all projects move one tiny step forward

I worked on code to make figures and captions to go in the Sloan Atlas. I discussed reincarnating The Thresher with Patel and Federica Bianco (NYU), who wants to use it on her Lucky Imaging data. Vakili and I asked Foreman-Mackey to teach Vakili how to access and use SDSS Stripe 82 data on stars, to build a prior PDF for the point-spread function. I wrote text on my ideas for improving aperture photometry. I got so engrossed in the latter writing that I missed my subway stop on my way up to Columbia and ended up having to walk 17 blocks back downtown for #NYCastroML.

2014-02-17

pile of stuff, assembled and shipped

I spent the day assembling my zeroth draft material for my Atlas together into one file, including plates, captions, and some half-written text. It is a mess, but it is in one file. All the galaxies are shown at the same plate scale and same exposure, calibration, and stretch. One of the hardest problems to solve (and I solved it boringly) is how to split up the page area into multiple panels (all at same plate scale) to show the full extents of all the galaxies without too much waste. Another hard problem was going through the data for the millionth time, looking at outliers and understanding what's wrong in each case. It is a mess, but as I am writing this I am uploading to the web to deliver it to my editor (Gnerlich at Princeton University Press).

2014-02-15

under the gun

I worked all day trying to get a zeroth draft of all the plates for my Atlas together for delivery to my editor; I have a deadline today. I got a set of plates together, but I couldn't get it assembled with captions and the partially written text I have into one big document. That is, I failed. I will have to finish on Monday.

2014-02-14

sick kid = research

I had a full day hiding at home and working; I spent it on my Atlas. I got multi-galaxy plates close to fully working and worked on the automatic caption generation. On the multi-galaxy plate issue, one problem is deciding how big to make each image: Galaxies scaled to the same half-light or 90-percent radius look very different when presented at the same exposure time, brightness, and contrast (stretch). One of the points of my Atlas is to present everything in a quantitatively comparable way, so this is a highly relevant issue.

2014-02-12

intensity vs flux bug

I spent some quality time with Ekta Patel tracking down a bug in our visualization of output from The Tractor. In the end it turned out to be a think-o (as many hard-to-find bugs are) in which I had put in some calibration information as if it calibrated flux, when in fact it calibrates intensity. The flux vs intensity issues have got me many times previously, so I might learn it some day. As my loyal reader knows (from this and this, for example) I feel very strongly that an astronomical image is a measure of intensity not flux! If you don't know what I mean by that, probably it doesn't matter, but the key idea is that intensity is the thing that is preserved by transparent optics; it is the fundamental quantity.
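A toy numerical version of the distinction (the numbers are made up, and the 0.396 arcsec pixel scale is just an SDSS-like choice for illustration): an image calibrated in intensity carries units of flux per solid angle, so fluxes only appear after multiplying by pixel area and summing.

```python
import numpy as np

# an image calibrated in intensity units (flux per square arcsec);
# the 0.396 arcsec pixel scale is an SDSS-like choice for illustration
pixel_scale = 0.396                       # arcsec per pixel (assumed)
pixel_area = pixel_scale ** 2             # square arcsec per pixel
intensity_image = np.full((50, 50), 2.0)  # uniform intensity patch (toy data)

# flux is intensity integrated over solid angle: per-pixel fluxes are
# intensities times the pixel area, and an aperture flux is their sum;
# applying a flux calibration directly to intensity pixels is the bug
flux_image = intensity_image * pixel_area
aperture_flux = flux_image.sum()
```

The bug in the visualization was exactly a confusion of these two unit systems: a calibration appropriate for `flux_image` applied to `intensity_image`.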

2014-02-05

weather = research

My trip to Penn State got cancelled for weather, which is bad, not least because Eric Ford (PSU) and I have lots to discuss about getting our exoplanet characterization and discovery software programs funded, and not least because there are many people there with whom I was looking forward to valuable conversations. That said, I just got two days added into my schedule in which I had cancelled all regular meetings. I spent my time today working on my Sloan Atlas of Galaxies project, making mock-ups of some of the plates to check that my sizing, spacing, borders, and so on all make sense.

2014-01-22

weak lensing

I had two telecons today. The first was with Phil Marshall, to discuss my Atlas project, which needs some love and attention. Phil had lots of good ideas for improvements that make it more fun and more useful. In the second I joined a meeting with Marshall's GREAT3 team, which is getting ready to compete in the upcoming weak lensing competition. The team includes Lang, who is operating the farm machinery required. We discussed the importance of having an explicit (and good) prior over galaxy shapes, something that team members Schneider (LLNL) and Dawson (LLNL) are working on at the theory level. We also discussed how to parameterize ellipticity. My position is: If you are working at catalog level (which might be a mistake), you want to work with the point estimates (catalog entries) that are closest to having a Gaussian likelihood! This, if you trace it down, ends up being a statement about ellipticity parameterization. All that said, I expect that all methods working at catalog level are (in the end) doomed to failure. The only things I can see working at catalog level are actually more computationally intensive than working at image (pixel) level.
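One common parameterization choice in weak lensing (my illustration of the general issue, not necessarily what the team adopted) is the two-component ellipticity, which, unlike axis ratio and position angle, is smooth through the round-galaxy case:

```python
import numpy as np

def ellipticity_components(q, phi):
    """Convert axis ratio q = b/a and position angle phi (radians) into
    the two-component ellipticity (e1, e2) common in weak lensing. This
    parameterization is well behaved near round (q = 1), where the
    position angle phi itself becomes undefined."""
    e = (1.0 - q) / (1.0 + q)
    return e * np.cos(2.0 * phi), e * np.sin(2.0 * phi)
```

The point-estimate argument above is about choices like this: near-Gaussian likelihoods in (e1, e2) behave much better in a catalog-level analysis than the badly non-Gaussian likelihoods in (q, phi).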

2013-10-02

exoplanet search, large galaxies

I worked on large-galaxy photometry with Patel for part of the day. She is dealing with all our problem cases; we have good photometry for almost any large, isolated galaxy, but something of a mess for some of the cases of overlapping or merging galaxies. Not surprising, but challenging. I am also working on how to present the results: What we find is that with simple, (fairly) rigid galaxy models we get excellent photometry. How to explain that, when the models are too rigid to be "good fits" to the data? It has to do with the fact that you don't have to have a good model to make a good photometric measurement, and the fact that simple models are "interpretable".

In the afternoon, we had a breakthrough in which we realized that Foreman-Mackey's exoplanet search (which he sped up by a factor of 10⁴ on the weekend with sparse linear algebra and code tricks) can be sped up by another large factor by separating it into single-transit hypothesis tests and then hypothesis tests that link the single transits into periodic sets of transits. He may try to implement that tomorrow.
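The two-stage idea can be sketched like this (a schematic of my own, not Foreman-Mackey's actual code): first score a single box transit centered at every cadence, then, for each trial period, fold those per-cadence scores and look for phase bins where they add up coherently.

```python
import numpy as np

def single_event_stats(t, flux, ivar, duration):
    """For a box transit centered at each cadence, accumulate the
    inverse-variance-weighted depth numerator s1 and denominator s2;
    depth = s1/s2, and the chi-squared improvement of the single-transit
    hypothesis over flat is s1**2/s2."""
    s1, s2 = np.zeros_like(t), np.zeros_like(t)
    for j, t0 in enumerate(t):
        m = np.abs(t - t0) < 0.5 * duration
        s1[j] = np.sum(ivar[m] * (1.0 - flux[m]))
        s2[j] = np.sum(ivar[m])
    return s1, s2

def periodic_search(t, s1, s2, periods):
    """Link single-transit statistics into periodic sets: fold the
    per-cadence statistics at each trial period and score the
    best-aligned phase bin."""
    dt = np.median(np.diff(t))
    scores = np.zeros(len(periods))
    for i, P in enumerate(periods):
        nb = max(int(P / dt), 1)
        b = np.minimum((t % P / P * nb).astype(int), nb - 1)
        S1 = np.bincount(b, weights=s1, minlength=nb)
        S2 = np.bincount(b, weights=s2, minlength=nb)
        scores[i] = np.max(S1 ** 2 / np.maximum(S2, 1e-12))
    return scores
```

The speedup comes from the factorization: the expensive per-cadence statistics are computed once, and the loop over trial periods only re-bins them.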

2013-07-30

does the fine-structure constant vary with cosmic time?

For the last two days, NYU undergrads Ekta Patel and David Mykytyn have been doing great work on a range of projects. Patel has been working on visualizing and vetting the galaxies that make up the Sloan Atlas of Galaxies. This takes serious judgement and patience! Mykytyn has been quantitatively comparing the g and r fluxes of time-variable quasars, on a path to making a statistical model for arbitrarily heterogeneous multi-band, multi-epoch quasar photometry.

This morning, Jonathan Whitmore (Swinburne) gave a nice talk about monitoring the fine-structure constant (alpha) as a function of cosmic time using metal-line absorption in quasar spectra. He finds no signal (contrary to some claims in the literature) and has a plausible explanation for some weak claims: The detailed wavelength calibration of the echelle spectrograph he is using seems to be different for the arc calibrations and for the gas-cell (or solar analog) spectra. This might be because the arc illuminates the spectrograph very slightly differently than the astronomical source. He has to build a working model of all this if he wants to improve the precision of the experiments, which all seem to be limited by these kinds of issues at the present day.

It might all sound crazy, but in fact most current models of inflation or the vacuum energy density do generically predict fundamental-constant variations. They aren't that specific about magnitudes of fluctuations within the Hubble Time, but the predictions are there qualitatively.

2013-07-24

undergraduate research

David Mykytyn and Ekta Patel, NYU undergraduates, showed up today for three weeks of sprint in Heidelberg. In addition to whatever Rix and I throw at them, they are working on GALEX calibration and the Sloan Atlas. Mykytyn worked on reformatting some of our GALEX Python code to make it more function based and less script-like. Patel worked on splitting the Sloan Atlas galaxy sample into subsamples by color and intensity.

2013-07-19

Atlas and proposal

Spent the day (in bed, for uninteresting reasons) working on my Atlas and NYU's Moore–Sloan proposal.

2013-04-01

data science

Today we prepared for a site visit by the Sloan and Moore Foundations, who are doing research related to data science and ways in which they could support it in the universities. People from all over the university—many of whom I know well because of our overlapping interests in extracting science from data—got together to figure out what things we want to discuss with the visiting team. One thing I was reminded of in the discussions is that MCMC is a truly cross-cutting tool for data science; it finds a use in every discipline. That makes me even more excited about our various MCMC ideas. Late in the day I worked on my Sloan Atlas of Galaxies. The Sloan Foundation has been pretty important in my scientific life!

2013-03-13

writing, robots

On the subway up to an undisclosed location in the West 70s, I wrote text for my Atlas of Galaxies. At the location, Schiminovich and I discussed next steps in the GALEX Photon Catalog project. We also discussed a proposal that Schiminovich is writing for a balloon-borne instrument that deals with variable conditions by making non-trivial decisions in real time in response to real-time measurements of transparency, sky brightness, and field-of-view (which can be hard to control on balloons). Obviously, a custom-built on-board lightweight implementation of Astrometry.net would be useful!