Re: what affects number of reducers launched by hadoop?
Joe Stein
Wed, 28 Jul 2010 12:41:00 -0700
mapred.tasktracker.reduce.tasks.maximum is the ceiling on how many reduce
tasks can run concurrently on each node
You need to configure *mapred.reduce.tasks* to be more than one, since it
defaults to 1 (you are overriding that default in your code, which is why it
works there)
This value should be somewhere between 0.95 and 1.75 times the maximum
reduce tasks per node times the number of data nodes.
So if you have 3 data nodes and each is set up with a max of 7 reduce tasks,
that is 21 reduce slots, and you would configure this somewhere between 20
(0.95 * 21) and 36 (1.75 * 21)
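A minimal sketch of that calculation in a driver, assuming (for illustration
only) a cluster of 3 data nodes with mapred.tasktracker.reduce.tasks.maximum
set to 7; the client API has no call to discover those numbers, so they are
hard-coded here:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class ReducerCountSketch {
      public static void main(String[] args) throws Exception {
        int dataNodes = 3;          // assumed cluster size
        int maxReducesPerNode = 7;  // mapred.tasktracker.reduce.tasks.maximum
        int totalSlots = dataNodes * maxReducesPerNode;  // 21 reduce slots

        // Lower end of the heuristic (all reduces run in one wave):
        // 0.95 * 21 rounds to 20. Use 1.75 instead for two staggered
        // waves, which balances load better across uneven reduces.
        int numReducers = (int) Math.round(0.95 * totalSlots);

        Job job = new Job(new Configuration(), "reducer-count-sketch");
        job.setNumReduceTasks(numReducers);  // same call Vitaliy used with 10
        // ... mapper, reducer, and input/output paths would be set here
        //     before submitting the job ...
      }
    }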
On Wed, Jul 28, 2010 at 3:24 PM, Vitaliy Semochkin wrote:
> Hi,
>
> In my cluster mapred.tasktracker.reduce.tasks.maximum = 4,
> however while monitoring the job in the JobTracker I see only 1 reducer
> working.
>
> First its status is
> reduce > copy - can someone please explain what this means?
>
> Afterwards it is
> reduce > reduce.
>
> When I set the number of reduce tasks for a job programmatically to 10 with
> job.setNumReduceTasks(10);
> the number of "reduce > reduce" reducers increases to 10 and the
> performance of the application improves as well (the number of reducers
> never exceeds 10).
>
> Can someone explain such behavior?
>
> Thanks in Advance,
> Vitaliy S
>
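For context, a minimal driver sketch showing where a call like
job.setNumReduceTasks(10) would sit; the class name and the identity
mapper/reducer are illustrative, not from the thread. Using ToolRunner also
lets the count be overridden at launch time with -D mapred.reduce.tasks=10
instead of hard-coding it:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class MyJobDriver extends Configured implements Tool {
      @Override
      public int run(String[] args) throws Exception {
        // getConf() already carries anything passed as -D on the command
        // line (e.g. -Dmapred.reduce.tasks=10), parsed by ToolRunner.
        Job job = new Job(getConf(), "my-job");
        job.setJarByClass(MyJobDriver.class);
        job.setNumReduceTasks(10);  // per-job override, as in the question

        // No mapper/reducer set: the identity classes are used by default.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        return job.waitForCompletion(true) ? 0 : 1;
      }

      public static void main(String[] args) throws Exception {
        System.exit(ToolRunner.run(new Configuration(), new MyJobDriver(), args));
      }
    }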